DOCUMENT RESUME

ED 223 680                                                TM 820 809

AUTHOR          Dunivant, Noel
TITLE           The Effects of Measurement Error on Statistical
                Models for Analyzing Change. Final Report.
INSTITUTION     New York Univ., N.Y.
SPONS AGENCY    National Inst. of Education (ED), Washington, DC.
PUB DATE        [81]
GRANT           NIE-G-78-0071
NOTE            185p.
PUB TYPE        Reports - Research/Technical (143)

EDRS PRICE      MF01/PC08 Plus Postage.
DESCRIPTORS     Algorithms; Analysis of Covariance; *Change; *Error
                of Measurement; *Mathematical Models; Performance
                Factors; *Psychometrics; Regression (Statistics);
                Research Design; *Research Methodology; *Statistical
                Analysis; Statistical Bias
IDENTIFIERS     *Change Analysis; Linear Models

ABSTRACT
     The results of six major projects are discussed, including a
comprehensive mathematical and statistical analysis of the problems
caused by errors of measurement in linear models for assessing
change. In a general matrix representation of the problem,
several new analytic results are proved concerning the parameters
which affect bias in observed-score regression statistics. The bias
in ordinary least squares estimators is expressed as a function of
covariances among true scores, among the measurement errors, and
sample size. The first two projects were employed to create an
algorithm for assessing the potential bias due to the unreliability
of measures. The algorithm was implemented as a FORTRAN program to
improve the design of investigations of change and minimize potential
errors of inference. A review is presented of statistical methods
which have been developed in several disciplines to estimate the
parameters of true change by correcting the observed-score regression
estimates for unreliability. A series of Monte Carlo experiments
which evaluated the performance of the methods are discussed. The
advantages and general superiority of estimators proposed by Fuller
are examined. The relevance of a special linear functional relation
(LFR) model and models devised for estimating the parameters of LFRs
are compared. (Author/CM)




***********************************************************************
*   Reproductions supplied by EDRS are the best that can be made     *
*                    from the original document.                     *
***********************************************************************




THE EFFECTS OF MEASUREMENT ERROR ON
STATISTICAL MODELS FOR ANALYZING CHANGE



U.S. DEPARTMENT OF EDUCATION
NATIONAL INSTITUTE OF EDUCATION

EDUCATIONAL RESOURCES INFORMATION
CENTER (ERIC)

This document has been reproduced as
received from the person or organization
originating it.

Minor changes have been made to improve
reproduction quality.

Points of view or opinions stated in this document
do not necessarily represent official NIE
position or policy.



Final Report
Grant NIE-G-78-0071
New York University



Noel Dunivant*
Principal Investigator



This research was supported by Grant NIE-G-78-0071 from the National
Institute of Education, U.S. Department of Education. However, points of
view or opinions do not represent official NIE position or policy.

*Now at the National Center for State Courts, 300 Newport Avenue,
Williamsburg, Virginia.



CONTENTS

Chapter

     ABSTRACT

I.   INTRODUCTION AND EXECUTIVE SUMMARY

     OVERVIEW
     CHAPTER II
     CHAPTER III
     CHAPTER IV
     CHAPTER V
     CHAPTER VI
     CHAPTER VII

II.  PROBLEMS CAUSED BY MEASUREMENT ERROR IN ANALYZING CHANGE

     INTRODUCTION
     GENERAL STRUCTURAL MODEL OF CHANGE
     PROOF OF BIAS AND INCONSISTENCY
          The Single-Predictor Case
          The Two-Variable Case
          The General Case
     EFFECTS OF MEASUREMENT ERROR ON OTHER ESTIMATES
     DEMONSTRATIONS OF BIAS CAUSED BY ERRORS OF MEASUREMENT
     REGRESSION VERSUS STRUCTURAL COEFFICIENTS
     IDENTIFICATION REQUIREMENTS
     LINEARITY CONDITIONS
     SUMMARY
     FOOTNOTES
     TABLES
     FIGURES

III. REVIEW OF METHODS THAT CORRECT FOR ERRORS OF MEASUREMENT

     INTRODUCTION
     SPEARMAN'S CORRECTION FOR ATTENUATION
     STOUFFER-LINDLEY METHOD
     STROUD'S METHOD
     FULLER'S METHODS
     PORTER'S METHOD
     APPLICATIONS OF METHODS TO REAL DATA
     CONCLUSION
     FOOTNOTES

IV.  ESTIMATION OF LINEAR FUNCTIONAL RELATIONS

     INTRODUCTION
     DEFINITION OF EQUIVALENCE AND CONGENERIC TESTS
     DATA ORGANIZATION AND NOTATION
     REVIEW OF METHODS FOR DETERMINING EQUIVALENCE
          Procedures Using Replications
               Lord-Villegas Test
               Kristof's Test
               Jöreskog's Test
          Procedures Using Error Variances
               Koopmans-Tintner Method
               Fuller's Test
               Jöreskog's Model
          Procedures Using Reliabilities
               Fuller-Hidiroglou Test
     REFERENCES
     FOOTNOTES
     TABLES
     FIGURES

V.   SOME ANALYTIC RESULTS FOR PARAMETERS AFFECTING BIAS IN
     GOODNESS OF FIT AND SAMPLING DISTRIBUTION STATISTICS

     INTRODUCTION
     EXPRESSIONS FOR TRUE-SCORE PARAMETERS
          s² for True Change
          R² for True Change
          Σbb for True Change
     EXPRESSIONS FOR THE OBSERVED-SCORE PARAMETERS
          s'² for Observed Change
          R'² for Observed Change
          Σb'b' for Observed Change
     PARAMETERS AFFECTING BIAS IN OBSERVED-SCORE REGRESSION
     STATISTICS
          Parameters Affecting Bias in s'²
          Parameters Affecting Bias in R'²
          Parameters Affecting Bias in Σb'b'
     CONCLUSION

VI.  AN ALGORITHM FOR ASSESSING BIAS IN PLANNED STUDIES
     OF CHANGE

     INTRODUCTION
     THE ALGORITHM
     AN EXAMPLE
     FORTRAN COMPUTER PROGRAM
     AN APPLICATION
     CONCLUSION

VII. MONTE CARLO COMPARISONS OF STATISTICAL METHODS THAT CORRECT
     FOR ERRORS OF MEASUREMENT

     INTRODUCTION
     REGRESSION CORRECTION METHODS
          Stouffer-Lindley Method
          Warren, White and Fuller Method
     ANCOVA CORRECTION METHODS
          Cohen and Cohen Method
          Porter's Method
          DeGracie and Fuller Method
          Analytical Comparison of ANCOVA Methods
     EXPERIMENT 1: REGRESSION SIMULATIONS
          Design
          Data Generation Methods
          Estimators and Performance Evaluation
          Results
          Conclusions
     EXPERIMENT 2: ANCOVA SIMULATIONS
          Design
          Data Generation Methods
          Estimators and Performance Evaluation
          Results
          Conclusions
     DISCUSSION
     FOOTNOTES
     TABLES

     REFERENCES

     APPENDICES

ABSTRACT

The final report describes the accomplishments of six major projects
undertaken as part of this funded research program. First, a
comprehensive mathematical and statistical analysis of the problems
caused by errors of measurement in linear models for assessing change is
presented. Results from several disciplines are integrated, and their
implications for studies of educational change discussed. Second, a
general matrix representation of the problem is formulated, and several
new analytic results are proved concerning the parameters which affect
bias in observed-score regression statistics. We derive equations which
express the bias in OLS estimators as a function of covariances among
the true scores, covariances among the measurement errors, and sample
size. Third, the results of the first two projects were employed to
create an algorithm for assessing the potential bias due to the
unreliability of measures. The algorithm has been implemented in the
form of a FORTRAN program which can be used by researchers to improve the
design of investigations of change in order to minimize the likelihood of
potential errors of inference. Fourth, we undertake a comprehensive
review of statistical methods which have been developed in several
disciplines to estimate the parameters of true change by correcting the
observed-score regression estimates for unreliability. The methods are
formulated in a common algebra and evaluated in terms of bias and power.
Fifth, the report describes the results of a series of Monte Carlo
experiments which evaluated the performance of several methods which
utilize a priori information about the variance structure of the errors
of measurement to estimate the parameters of the true-score regressions.
The advantages and general superiority of estimators proposed by Fuller
and his colleagues are discussed. Sixth, a special type of model, the
linear functional relation (LFR), is discussed in terms of its relevance
for the study of change. A variety of models which have been devised in
psychometrics and econometrics for estimating the parameters of LFRs are
compared and recommendations about the best methods to use are made. An
extensive bibliography and computer programs are included as appendices.

CHAPTER I
INTRODUCTION AND EXECUTIVE SUMMARY

OVERVIEW

This report describes the results of a large-scale research program
on the effects of measurement error on linear statistical models for
analyzing psycho-educational change in quasi-experimental and
nonexperimental studies. As components of the research program, six
projects were undertaken that analyzed the bias in observed-score
regression estimators and evaluated the performance of statistical models
which estimate the parameters of the true-score regressions by using
information about the variance structure of the errors of measurement.
Each project constitutes a separate chapter in the final report. The
statistical results developed in each chapter are general in that they
apply with equal validity to all linear models analyses, especially
multiple regression/correlation (MRC) and analysis of covariance
(ANCOVA). However, the results and their implications are discussed
primarily with respect to studies of psychological and educational
growth. The usefulness of the findings for educational researchers is
described in considerable detail. Each chapter offers specific
recommendations concerning ways in which researchers can guard against
making errors of inference about the determinants of change because of
errors of measurement. We believe that the results of this research
program, if utilized by investigators, can substantially improve the
quality of studies of educational change.

In the first project (Chapter II) we present a comprehensive
mathematical and statistical analysis of the problems caused by errors of
measurement in linear models for assessing change. Results from several
disciplines are integrated, and their implications for studies of
educational change are discussed. The second project (Chapter V)
provides a general matrix representation of the problem and proves
several new analytic results concerning the parameters which affect bias
in observed-score regression statistics. We derive equations which
express the bias in ordinary least squares (OLS) estimators as a function
of the covariances among the true scores, covariances among the errors of
measurement, and sample size. The objective of the third project
(Chapter VI) was to devise an algorithm for assessing the potential bias
resulting from the unreliability of measures. The algorithm, which has
been implemented in the form of a FORTRAN program, can be used by
researchers to improve the design of research projects and program
evaluations in order to minimize the likelihood of potential errors of
inference about the determinants of change. As part of the fourth
project (Chapter III) we undertook a comprehensive review of the
statistical methods which have been developed in several disciplines to
estimate the parameters of true change by correcting the observed-score
regression estimates for unreliability. The methods are formulated in a
common algebra and evaluated in terms of bias and power. The fifth
project (Chapter VII) consisted of a series of Monte Carlo experiments
which were designed to evaluate the performance of several methods that
utilize a priori information about the variance structure of the errors
of measurement to estimate the parameters of the true-score regressions.
The advantages and general superiority of estimators proposed by Fuller
and his colleagues are discussed. In the sixth project (Chapter IV) a
special type of model, the linear functional relation (LFR), is
introduced and discussed in terms of its relevance for the study of
change. A variety of models which have been developed in psychometrics
and econometrics for estimating the parameters of LFRs are compared and
recommendations about the best methods are made. In the following
sections, a summary of each project (chapter) is given.

CHAPTER II

The purpose of this chapter is to demonstrate the bias caused by
errors of measurement in linear statistical models for analyzing change
and to alert educational researchers to the potential errors of inference
concerning the determinants of true change which can result from using
unreliable measures in multiple regression/correlation and analysis of
covariance. We provide a mathematical statistical analysis of the
effects of measurement error on OLS estimators. The general situation
considered involves pretest and posttest measurements on some attribute
that is expected to change as a function of intervening experience (e.g.,
treatment) and background characteristics. A general linear model, which
has been proposed for studying change by several authors, is described.
Definitions of parameters of change and procedures for testing hypotheses
about the effects of treatment and background variables are also
presented. Then a simple test score model which takes the observed
(manifest) score as a linear function of true and random error (latent)
variables is introduced. Next we rewrite the mathematical model of
change to incorporate this measurement model, thereby explicitly
recognizing the fact that the variables are not perfectly reliable.

The estimators and tests based on the observed-score distributions
are then evaluated in terms of how adequately they estimate the
parameters of the true-score regressions or test hypotheses about effects
on true change. We prove that the observed-score regression estimators
are biased and inconsistent for the structural parameters, the magnitude
and direction of bias being a complex function of the intercorrelations
and reliabilities of the variables. It is noted that measurement error
can exert harmful effects not only on estimators of the regression
coefficients but also on the squared multiple correlation and mean square
error. Several examples are given of how large the bias and how
incorrect the resulting inferences about the determinants of change can
be. Following the proofs and demonstrations, the differences between the
interpretations and uses of the structural (true-score) and
observed-score regression weights are discussed. We then introduce the
concept of identifiability and show how it is essential to
determining the estimability of the structural parameters. In the final
section of the chapter, the known conditions for the linearity of the
observed-score relation when the structural relation is linear are
delineated. The chapter provides a great deal of evidence that
researchers should be very cautious when interpreting MRC and ANCOVA
results based on observed scores. In many research situations the
observed-score estimates will be so biased that highly inaccurate
inferences concerning the effects of treatment and background
characteristics on true change will be drawn.
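
The attenuation described in this chapter summary can be illustrated
with a small simulation. The sketch below is not taken from the report:
it assumes a single true-score predictor with unit variance, normally
distributed and mutually independent errors, and arbitrary parameter
values. The function name simulate_attenuation and its arguments are
hypothetical.

```python
import numpy as np


def simulate_attenuation(beta=1.0, reliability=0.7, n=200_000, seed=0):
    """Return (true_slope, observed_slope) for one simulated sample.

    Assumed model (illustrative only):
        y = beta * T + u      structural (true-score) relation
        X = T + e             observed predictor = true score + error
    """
    rng = np.random.default_rng(seed)
    true_x = rng.normal(0.0, 1.0, n)  # true scores, variance 1
    # Choose the error variance so that var(T) / var(X) equals `reliability`.
    err_var = (1.0 - reliability) / reliability
    obs_x = true_x + rng.normal(0.0, np.sqrt(err_var), n)
    y = beta * true_x + rng.normal(0.0, 0.5, n)

    # OLS slope of y on a single predictor: cov(x, y) / var(x).
    true_slope = np.cov(true_x, y)[0, 1] / np.var(true_x, ddof=1)
    obs_slope = np.cov(obs_x, y)[0, 1] / np.var(obs_x, ddof=1)
    return true_slope, obs_slope


if __name__ == "__main__":
    true_slope, obs_slope = simulate_attenuation()
    print(f"slope using true scores:     {true_slope:.3f}")
    print(f"slope using observed scores: {obs_slope:.3f}")
```

Under these assumptions the observed-score slope converges to the true
slope multiplied by the reliability of the predictor, so enlarging the
sample does not remove the discrepancy. With several correlated
predictors the bias need not be toward zero.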



CHAPTER III

In this chapter a variety of single-equation statistical methods that
have been developed in education, psychology, sociology, and econometrics
for estimating the structural parameters are reviewed. Our objective is
to draw together the techniques from diverse sources, to express them in
a common algebra that is synchronous with equations of Chapter II, and to
analytically evaluate them with respect to the statistical criteria of
bias, power, and robustness. It is hoped that more investigators will be
prompted to use one of the methods as a consequence of this review. The
results are intended to serve as guides for educational researchers who
wish to use one of the methods but do not know how to evaluate them.

In the first section we consider the original attenuation correction
formulas of Spearman and several more recent generalizations of the
method to semipartial and partial correlations. Although equations for
the corrected estimators are simple and straightforward, finite sampling
theory for the zero-order and partial correlations corrected for
attenuation has proven intractable. The methods of Porter (1967), Stroud
(1972), and DeGracie and Fuller (1972) can be used in situations
appropriate for one-way analysis of covariance. Of these, Porter's and
DeGracie and Fuller's procedures have the more general applicability.
The exactness of Stroud's method, however, strongly commends it for the
two-group design. Although the DeGracie and Fuller procedure appears
less powerful than Porter's, this disadvantage may be more than offset by
the reduced bias and the safeguards of the procedure which protect
against the correction for attenuation producing "impossible" slope
estimates.
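
For readers unfamiliar with the classical correction, the following
sketch shows Spearman's formula for a zero-order correlation: the
observed correlation divided by the square root of the product of the
two reliabilities. It is offered only as an illustration; the function
name correct_for_attenuation and the numerical values are hypothetical
and do not come from the report.

```python
import math


def correct_for_attenuation(r_xy, rel_x, rel_y):
    """Spearman's correction: r_xy / sqrt(rel_x * rel_y).

    r_xy  : observed correlation between X and Y
    rel_x : reliability of X, in (0, 1]
    rel_y : reliability of Y, in (0, 1]
    """
    if not (0.0 < rel_x <= 1.0 and 0.0 < rel_y <= 1.0):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / math.sqrt(rel_x * rel_y)


if __name__ == "__main__":
    # A moderate observed correlation with fairly reliable measures.
    print(round(correct_for_attenuation(0.42, 0.70, 0.80), 3))
    # With low reliabilities the corrected value can exceed 1.0.
    print(round(correct_for_attenuation(0.80, 0.60, 0.70), 3))
```

The second call returns a value greater than 1.0, which is the kind of
inadmissible estimate that an unguarded correction can produce when
sample correlations and reliability estimates are combined directly.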



For the more general kinds of situations in which MRC and factorial
ANCOVA would be appropriate, researchers may select one of the
Stouffer-Lindley or Fuller methods. It seems clear that for data which
conform to the usual assumptions of normality, homoscedasticity, etc.,
the statistical estimation and testing procedures developed by Fuller
(1980), Fuller and Hidiroglou (1978), and Warren, White, and Fuller
(1974) will prove superior to the Stouffer (1936) and Lindley (1947)
methods. Fuller's methods preclude estimation of singular covariance
matrices following corrections due to unreliability, yield significance
tests which are valid for finite samples, and provide a mechanism for
incorporating information about the sampling distributions of the
predictor reliabilities into the standard errors of the estimators of the
true-score regression coefficients. Consequently, it appears that the
Stouffer-Lindley estimators are more sensitive than those of Fuller and
his associates.

It is concluded that the existing methods provide several adequate
estimators of the true-score regression parameters. The major remaining
problems concern sampling theory for the estimators of the structural
parameters. The validity of significance tests remains a significant
question for all of the estimators except Fuller's. This chapter
clarifies and refines these issues, and the simulation studies reported
below add further insight. It is pointed out that questions involving
the type of reliability estimate to use and testing the assumption of
homogeneity of true-score regressions constitute important problems for
future research.

Several examples of the application of the correction methods
illustrated the kinds of error of inference that could have resulted
from errors of measurement in previous investigations of educational
change. It is hoped that the explication and evaluation of the
attenuation-correction methods provided in this chapter will encourage
and facilitate their use in future studies.

CHAPTER IV

The purpose of this chapter is to analyze the problem of determining
if a perfect linear relation exists among two or more variables and to
review some statistical methods that have been developed to estimate and
test linear functional relations. By definition, a linear functional
relation (LFR) exists if the true scores on two (or more) measures are
perfectly correlated. Although most of the statistical work on LFR has
been done by econometricians, a problem has been investigated in the
field of psychometrics which is formally identical to LFR.

Psychometricians have developed several statistical tests of the
hypothesis that two scales measure the same attribute except for
differences in means, units of measurement, and standard errors of
measurement (or reliabilities). When scales satisfy these conditions
they are said to be equivalent or congeneric. As is demonstrated in the
chapter, equivalent tests are related by a linear functional relation.
The correlation between equivalent measures, i.e., between two variables
that have a linear functional relation, when corrected for attenuation
(unreliability) is 1.0. In this chapter the diverse theory and methods
from econometrics, statistics, education, and psychometrics are
collected, compared, and integrated. Several new results are derived for
the errors-in-variables problem which should prove helpful in analyzing
change occurring in measures which contain errors of measurement. Several
ways in which LFR models can be applied in studies of change are
discussed and illustrated.

The chapter explicates and compares seven statistical methods
designed to determine if the true scores from two or more tests are
perfectly linearly related. They fall into one of three sets depending
upon the type of information or data required by the procedure. The
first group contains three methods which require replicate measures of
each scale, viz., Jöreskog (1971), Kristof (1973), and Lord (1973). In
the second set are three methods which assume information is available
about the covariance structure of the errors of measurement. While such
information may be obtained from replicated data, it can come from any
other independent sources. These methods, which were formulated
primarily by statisticians concerned with estimating and testing linear
functional relations, include the methods of Koopmans (1937) and Tintner
(1945, 1946), Fuller (1980), and Jöreskog (1971). The third set of
methods includes only Fuller and Hidiroglou's (1978) procedure for
testing matrix singularity when independent information about the
reliabilities of the variables is available. The method uses the
reliabilities to adjust the covariance matrix of observed scores in much
the same way that the estimates of measurement error variances are
utilized by the procedures in the second group. Indeed, all seven
procedures are very similar in logic, if not in mathematical detail:
each uses information about the covariance structure of the observed
measures and errors of measurement (from replicate measurements, error
variance estimates, or reliability estimates) to estimate the parameters
of the linear functional relation.
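
The shared logic of these procedures (remove the measurement-error part
of the observed covariance matrix, then ask whether what remains is
singular) can be sketched as follows. This is only an illustration of
the adjustment step under the assumption of mutually uncorrelated
errors and known reliabilities; it is not the Fuller and Hidiroglou
test statistic or any other procedure from the report, and the function
name and numerical values are hypothetical.

```python
import numpy as np


def reliability_adjusted_cov(obs_cov, reliabilities):
    """Subtract assumed error variances from the diagonal of obs_cov.

    Error variance of variable i is taken as (1 - reliability_i) times
    its observed variance; errors are assumed mutually uncorrelated.
    """
    obs_cov = np.asarray(obs_cov, dtype=float)
    rel = np.asarray(reliabilities, dtype=float)
    error_var = (1.0 - rel) * np.diag(obs_cov)
    return obs_cov - np.diag(error_var)


# Population values for two congeneric measures: T2 = 2*T1 + 3 with
# var(T1) = 1, error variances 0.25 and 1.0 (both reliabilities 0.8).
obs_cov = np.array([[1.25, 2.0],
                    [2.0, 5.0]])
adjusted = reliability_adjusted_cov(obs_cov, [0.8, 0.8])

# eigvalsh returns eigenvalues in ascending order; a smallest eigenvalue
# near zero indicates a (near-)singular true-score covariance matrix.
smallest_eig = np.linalg.eigvalsh(adjusted)[0]

# Correlation corrected for attenuation, from the adjusted matrix.
corrected_r = adjusted[0, 1] / np.sqrt(adjusted[0, 0] * adjusted[1, 1])
```

With these population values the adjusted matrix is singular and the
corrected correlation equals 1.0 up to rounding error. In a sample,
neither holds exactly, which is why a formal test of singularity is
needed.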



The choice of optimal method for estimating and testing a linear
functional relation depends to a great degree on the complexity of the
hypothesized model and type of data available. When there are replicate
measures for each variable, any one of the seven procedures can be used.
With simple models, Kristof's, Fuller's, or Fuller and Hidiroglou's
methods should prove generally superior to the others. Jöreskog's COFAMM
and LISREL models will be preferred for more complex models where the
sample size is large. The methods of Kristof and Fuller can be expected
to be more robust to assumption violation, especially to nonnormality.
When only estimates of measurement error variances and reliabilities are
available, the relative advantages and superior performance of Fuller's
methods should lead to their selection. The validity of the significance
tests for finite samples strongly commends his procedures. Although
Fuller's procedures may be less sensitive in certain applications, their
ease of computation and the availability of a computer program for making
the computations commend them.

CHAPTER V

In this chapter we derive a general matrix representation which
expresses the parameters of the observed-score regression as functions of
the covariances among the true and error components. Explicit
expressions are derived for the bias in observed-score estimators of the
mean square error, squared multiple correlation, and sampling
distribution of the regression coefficients. Thus, we are able to
evaluate the parameters affecting bias. The kinds of data and conditions
are specified which are likely to lead to incorrect inferences concerning
the determinants of true change based on the results of observed-score
regressions. The general equations expressing the bias in observed-score
regression estimators have not been presented previously and represent a
significant contribution of this research. They enable educational
researchers to determine a priori the potential for misleading inferences
in planned research.

The observed-score estimator of the mean square error is always
positively biased, i.e., increased in magnitude relative to the
true-score parameter, by errors of measurement in the posttest, or
criterion, variable. Thus, power is reduced and the probability of Type
II error is increased by unreliability of the posttest. Although a
general statement cannot be made about the biasing effects of correlated
criterion and predictor measurement errors, with the kinds of data
typically encountered in studies of educational change we can expect bias
to increase as the correlations increase. As the variances of the
predictor errors of measurement grow in size, the bias in the mean square
error also grows. This effect becomes especially pronounced as the
measurement error variances approach the magnitude of the true-score
variances. Unequivocal statements about the degree of bias introduced by
correlations among the predictor measurement errors cannot be made. It
depends upon the patterns of both the true-score and measurement error
intercorrelations. If it is assumed that all measurement errors in the
dependent and independent variables are mutually uncorrelated, it will be
generally true that unreliability increases the mean square error and
decreases power.

Since the squared multiple correlation is an inverse
function of the variance of the regression residual, we know that the
factors which positively bias the mean square error negatively bias the
estimator of the true-score multiple correlation coefficient. Although
the parameters affecting bias are seen to be highly complex, the overall
effect of predictor measurement errors will be to increase the bias in
the estimate of the coefficient of multiple determination when most error
covariances are positive and relatively small in size. Thus,
observed-score regression analyses of change on the average will
underestimate the goodness of fit of the true-score model in most
educational applications.

Bias in the estimator of the sampling
distribution of the vector of observed-score regression coefficients also
depends upon the covariances among the true scores and among the errors
of measurement. The same factors which affect the mean square error and
squared multiple correlation have similar effects on estimators of the
standard errors of the regression weights. The main determinants of the
joint sampling distributions, however, are the patterns of the joint
distributions of the true and error components. General statements about
the magnitude and direction of bias cannot be made. Thus, the general
effects of bias on t-tests for the individual coefficients are difficult
to assess. The formulas presented in this chapter do enable researchers
to evaluate the potential for bias in any specific set of circumstances,
however. Therein lies their value.
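
As an illustration of the kind of expression involved, the standard
large-sample results for the simplest case are sketched below. The
notation is assumed for this sketch and is not the report's: t is the
vector of true predictor scores, e the vector of predictor measurement
errors, u the structural disturbance, and e_y the criterion measurement
error. All error terms are assumed mutually uncorrelated and
uncorrelated with t, and all variables are centered. The report's own
equations also cover correlated errors and finite sample size, which
this sketch does not.

```latex
\begin{aligned}
y &= \mathbf{t}^{\top}\boldsymbol{\beta} + u, \qquad
Y = y + e_y, \qquad
\mathbf{x} = \mathbf{t} + \mathbf{e}, \\[4pt]
\operatorname{plim}\,\mathbf{b}^{*}
  &= (\Sigma_{tt} + \Sigma_{ee})^{-1}\,\Sigma_{tt}\,\boldsymbol{\beta}, \\[4pt]
\sigma^{2}_{Y\cdot\mathbf{x}}
  &= \sigma^{2}_{u} + \sigma^{2}_{e_y}
   + \boldsymbol{\beta}^{\top}\!\left[\Sigma_{tt}
   - \Sigma_{tt}\,(\Sigma_{tt} + \Sigma_{ee})^{-1}\,\Sigma_{tt}\right]\!
     \boldsymbol{\beta}
  \;\ge\; \sigma^{2}_{u}.
\end{aligned}
```

Here b* denotes the observed-score OLS coefficient vector. The
bracketed matrix is positive semidefinite, so under these assumptions
the observed-score residual variance is never smaller than the
true-score residual variance, and the criterion error variance adds to
it directly.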

CHAPTER VI

The purpose of this chapter is to develop a method for investigators
to easily assess the possible impact of measurement error on statistical
analyses of change. Using the results of the preceding chapters,
especially those of Chapter V, an algorithm is developed which takes as
input estimates of the parameter values of the structural relations among
the latent variables (which the investigator thinks are close to the true
value a priori) and outputs the expected values of the corresponding
observed-score regression parameters for a prespecified sample size. The
logic of the algorithm is explained and illustrated with a simple example
of the effects of external locus of control orientation on change in
science achievement.

As part of this research program, the algorithm was implemented in
the form of a FORTRAN computer program which can be easily installed in
most software libraries. Input to the program consists of information
about the covariances among the true predictors, the reliabilities of the
observed predictors, and the true-score regression coefficients. The
program outputs values of the true-score regression parameters and those
of the corresponding observed-score regression parameters. Comparison of
the two sets of parameter values allows one to assess the degree of bias
likely to occur in observed-score regression coefficients as estimators
of their true-score counterparts. In the final section of the chapter, a
comprehensive application of the computer program is presented.
Use of thet program will enable investigators to become aware of the 
^ays in which measurement^ error may bias regression analyses of change. 
Making this evaluation before data collection is completely analogous to 
carryi^ oiit a power analysis. The results of the assessment may lead 
. the investigator to. modify data collection- p^ans. For example, the 
program! may reveal that the reliability of the pretest must be increased 
if accurate inferences are to be possible. The assessment may indic^e 
that bias lean not be avoided easily and prompt the investigator to gather 



the 4ata in such a way as to make the use of attenuation-correction 
m^hods ar multiple indicator (LISREL) models possible. ^ Also, as with ^ 
power' analysis, the program can used post hoc to .determine the degree- 
tpr caution one Should have when interpreting the results of the 
regression analyses of observed scores. i»Jn many situations, like the one 
described in the example in this chapter, it will te concluded that the 
possible bias in the observed-score* regression estimators was so great 
that any inferences must be regarded as completely suspect. 

CHAPTER VII

Chapter VII reports the results of the Monte Carlo experiments
designed to evaluate the performance of various multiple regression and
analysis of covariance methods that correct for errors of measurement.
The objective of this research was to determine which statistical
procedures for estimating the structural parameters of change
demonstrated the least bias and most power. Only when this information
is provided to educational researchers can they choose an estimation
technique that is optimal for their purposes. The results of these
simulations can be utilized to reduce the chances for drawing faulty
conclusions about the effects of treatments or individual differences on
true change from analyses of observed scores. We were fortunate to be
able to derive a number of analytic results which obviated the need for
some of the simulations that had been originally anticipated.

Two simulation experiments were conducted. The first compared
regression methods, and the second evaluated analysis of covariance
procedures. Different simulations were required because the MRC and
ANCOVA correction methods required different information. Specifically,
individual scores were needed for the ANCOVA simulations, while only
covariance matrices were required for the regression studies.

In the first experiment the performance of the Stouffer-Lindley
method was compared with that of the Warren, White, and Fuller (1974)
procedure. Both of these were contrasted with traditional OLS regression
analysis of the observed scores. Covariance matrices of true, error, and
observed scores were defined according to the equations given in the
preceding chapters and randomly generated using IMSL subroutines. The
effects of six factors on the relative performances of the OLS,
Stouffer-Lindley, and Warren, White, and Fuller methods were
systematically assessed: the effect of the pretest, the pretest-research
factor correlation, the pretest reliability, the posttest reliability,
the sample size, the effect of the research factor on true change, the
sampling variance of the pretest reliability coefficient. The
performance of the methods was measured for many statistics, including
the mean square error of the model, the pretest regression coefficient,
and the standard error of the pretest regression weight. Main interest,
however, concerned the bias and sensitivity of the regression estimator
for the effect of the research factor on change. Bias, power, and
probability of Type I error for the three methods were evaluated as
relative and absolute criteria. The results indicated that both the
Stouffer-Lindley and Warren, White, and Fuller methods performed
adequately across the conditions simulated. The degree of bias in both
null and nonnull conditions was small, usually less than 10%. The
direction of the bias, however, was unpredictable. In general, bias
increased as the effect of the pretest (or pretest-posttest correlation)
and the pretest-research factor correlation increased. Bias decreased as
the reliability of the research factor grew. Empirical alpha values did
not differ substantially from nominal levels. Power was enhanced by
pre-post correlation, reliability of both pretest and research factor
scores, and measurement. As expected, pretest-research factor
correlation adversely affected power. The Warren, White, and Fuller
method was superior to the Stouffer-Lindley procedure when the sampling
variability of the estimators of the pretest measurement error variance
(or reliability) was recognized. The former method explicitly
incorporates information concerning the variability of reliability
estimators. As the sample size upon which the reliability estimate is
based becomes very large, the two methods produce virtually identical
results. With small sample sizes, however, the Stouffer-Lindley
estimators can perform very poorly under certain sets of conditions. We
conclude that the method of Warren, White, and Fuller can be recommended
as the multiple regression method of choice for studies of educational
change.

The second series of Monte Carlo studies, on ANCOVA methods, produced
results that closely parallel those obtained for regression methods. The
effects of several factors, e.g., pretest-posttest correlation, on the
bias and power of estimators of covariate-adjusted means were assessed in
the two-group ANCOVA design. The DeGracie and Fuller method demonstrated
superior performance to the Cohen and Cohen and Porter methods when there
was variation in reliability estimates and sample sizes were small. The
differences in performance among the methods diminished as sample size
increased. The use of the DeGracie and Fuller ANCOVA method for
estimating the effects of treatment groups on true change is advocated
for the kinds of situations generally found in educational research. The
availability of a computer program for performing regression and
covariance analysis by the Fuller methods greatly facilitates the
application of these true-score estimation procedures in studies of
educational growth.



CHAPTER II 

PROBLEMS CAUSED BY MEASUREMENT ERROR IN ANALYZING CHANGE 




INTRODUCTION 



The objective of this chapter is to provide a mathematical analysis
of the effects of measurement error on statistical models for analyzing
change. The general situation that we consider involves pretest and
posttest measurements on some attribute that is expected to change as a
function of intervening experience (e.g., treatment) and background
characteristics. A general linear model which has been proposed by
several authors for studying change is presented, and definitions of
parameters of change and procedures for testing hypotheses about change
as a function of treatment and background characteristics are developed.
Then a simple test score model, which takes the observed score as a
linear function of an unobserved true (or latent) variable and a random
error component, is introduced. The mathematical model of change is then
rewritten to incorporate this measurement model, thus explicitly
recognizing the fact that the variables are not perfectly reliable.

The estimators and tests based on the observed-score distributions
are then evaluated in terms of how adequately they estimate the
parameters of the true-score distribution or test hypotheses about true
change. Briefly, it is proved that the observed-score estimators are
biased for the structural parameters, the magnitude and direction of bias
being a complex function of the intercorrelations and reliabilities of
the variables. Next, covariance structure analysis is used to explicate
the relationship between the observed-score and latent-variable
parameters. This enables us to determine values of the observed-score
parameters when given the corresponding values of the true-score
parameters and reliability information.

It should be pointed out that in this chapter we do not treat those
situations where assignment to treatment has been made on the basis of an
unreliable pretest. Following the pioneering work of Goldberger (1972),
several statisticians (Kenny, 1975; Overall & Woodward, 1976a; Rubin,
1977; Weisberg, 1979) have demonstrated that unbiased estimators of the
differences between the treatment and control groups in true change can
be obtained when this kind of selection process is employed. Furthermore,
the effects of measurement error when group regressions have
heterogeneous slopes lie beyond the scope of this chapter. The reader is
referred to Rogosa (1977b) for an excellent, comprehensive treatment of
the effects of measurement error on the Johnson-Neyman technique.

GENERAL STRUCTURAL MODEL OF CHANGE

Before discussing the bias caused by errors of measurement, the
general framework and model for studying change must be developed, since
this has been an issue of some controversy (cf. Cronbach & Furby, 1970;
Keesling & Wiley, 1977; Linn & Slinde, 1977; Werts & Linn, 1970; Wiley &
Harnischfeger, 1973). Following Wiley and Harnischfeger (1973), we wish
to consider a structural model of the effects of initial status
(pretest), treatment program, and background characteristics on final
status (posttest) with respect to some quantitative attribute.

The model diagrammed in Figure 1 shows the possible causal relations
(assuming the model is correctly specified) among these four factors.

    Insert Figure 1 about here

Since the model is represented along a time dimension, the background
variables, e.g., parents' SES, sex, or aptitude, are depicted as most
remote from the posttest. These background characteristics can affect
either initial status (solid arrow) or the probability of assignment to
treatment programs when subjects are not randomly assigned (broken
arrow). Initial status can influence final status (solid arrow) and
treatment group membership when assignment to groups is based on
pretest scores (broken arrow).

Letting $X_1$ represent initial status, $X_2$ treatments, $X_3$
background characteristics, and $Y$ final status, the model may be written
in symbolic form as:

$$Y = b_0 + b_1X_1 + b_2X_2 + b_3X_3 + e \tag{1}$$

where the $b_j$ are structural coefficients characterizing the
multivariate distribution of the $Y$ and $X_j$, and $e$ is a stochastic
term symbolizing sampling or specification error. In most applications
the $b_j$ are partial regression weights indicating the contribution of
$X_j$ to $Y$ and could be subscripted as $b_{YX_j \cdot X_{j'}}$. This
notation more clearly demonstrates that reference is to the effect of
$X_j$ on $Y$ while controlling the effects of the other $X_{j'}$ (where
$j' \neq j$).

If we define change or growth as the difference between final and
initial statuses, simple algebraic manipulation of Equation 1 allows us
to show the relationship of this model to one that takes change as the
dependent variable. Subtracting $X_1$ from both sides of Equation 1, we
obtain:

$$C = Y - X_1 = b_0 + (b_1 - 1)X_1 + b_2X_2 + b_3X_3 + e \tag{2}$$

where $C$ designates change. Thus, $b_2$ and $b_3$ are the same in both
equations while the weight for the pretest in Equation 2 is simply one
unit less than the comparable coefficient in Equation 1. With this
approach to defining change, clearly there is no need to deal with actual
change scores, as Werts and Linn (1970) and Wiley and Harnischfeger (1973)
have pointed out. The coefficients specified in Equation 1 can be
interpreted as parameters of change.
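
The algebra relating the posttest model to the change-score model can be checked numerically. The sketch below is a minimal illustration and is not part of the report: it uses a single predictor (the pretest) only, and the coefficient values, sample size, and seed are hypothetical. It fits the posttest and the change score on the same pretest by least squares and confirms that the two slopes differ by exactly one while the intercepts agree.

```python
import random


def ols_line(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


rng = random.Random(0)  # fixed seed so the illustration is repeatable
pretest = [rng.gauss(50.0, 10.0) for _ in range(500)]
# Hypothetical structural values b0 = 5 and b1 = 0.8 (not from the report).
posttest = [5.0 + 0.8 * x + rng.gauss(0.0, 4.0) for x in pretest]
change = [y - x for x, y in zip(pretest, posttest)]

b0_post, b1_post = ols_line(pretest, posttest)  # posttest as criterion
b0_chg, b1_chg = ols_line(pretest, change)      # change score as criterion

# Both checks hold up to floating-point rounding.
print(abs((b1_post - b1_chg) - 1.0) < 1e-9)  # slopes differ by one unit
print(abs(b0_post - b0_chg) < 1e-9)          # intercepts coincide
```

Because the identity is purely algebraic, it holds in every sample and not only on average, which is why nothing is lost by working with the posttest as the criterion.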

The growth model represented in Equations 1 and 2 is defined in terms
of true scores or latent variables. That is, the equations specify the
structural model in the same terms as does the theory, i.e., as relations
among hypothetical constructs (cf. Cronbach & Meehl, 1955). In this
context, the $b_j$ assume considerable importance as parameters of the
hypothetical mechanism which generates the observed data. Thus, the $b_j$
indicate the strengths of particular connections among theoretical
constructs and, collectively, define a behavioral or psychological law.
In almost all research on change the investigator seeks information about
the form of the structural model defining change in status for some
behavioral domain and estimates the magnitudes of the $b_j$. If the
constructs or latent variables could be precisely measured, it would be a
relatively simple procedure to estimate the $b_j$ from the measurements
and to evaluate the adequacy of the model in accounting for the observed
data. Unfortunately, in the social sciences our capability to measure
theoretical constructs without error is limited, so that estimating and
testing structural models becomes highly problematic.



To describe the mechanism which more adequately reflects our beliefs
about how the observations came into being, we construct the model
depicted in Figure 2.

    Insert Figure 2 about here

In this schematic, recognition is given to the fact that one's
measurements are fallible, i.e., contain errors of measurement. Now a
system of structural equations is required to specify the model. We
add to Equation 1 the following:

$$x_1 = f_{10} + f_{11}X_1 + u_1 \tag{3}$$

$$x_2 = f_{20} + f_{21}X_2 + u_2 \tag{4}$$

$$x_3 = f_{30} + f_{31}X_3 + u_3 \tag{5}$$

$$y = f_{y0} + f_{y1}Y + v \tag{6}$$

where the $x_j$ and $y$ are observed or measured values, the $X_j$ and $Y$
are true scores or latent variables, the $u_j$ and $v$ are errors of
measurement, and the $f$ coefficients are parameters specifying the
regression of the observed scores on the underlying factors. The $f_{j1}$
are, in fact, factor loadings. The reader may recognize this as an
application of Jöreskog's (1971) theory of congeneric tests. Using vector
and matrix notation Equations 1 through 6 may be written compactly as

$$Y_k = \underline{b}\,\underline{X}_k + e_k \tag{7}$$

$$\underline{x}_k = \underline{f}_{0x} + \underline{F}_x\,\underline{X}_k + \underline{u}_k \tag{8}$$

$$y_k = f_{0y} + F_y Y_k + v_k \tag{9}$$

where underscoring is used to designate vector (lower case) and matrix
(upper case) quantities, and $\underline{F}$ is a diagonal matrix with the
factor loadings on the principal diagonal. Together Equations 7, 8 and 9
constitute a Linear Structural RELations (LISREL) model as defined by
Jöreskog (1972, 1973).
Equations 8 and 9 specify what is termed the measurement model, while
Equation 7 represents the structural or causal relation. Although social
researchers state their hypotheses in terms of Equations 7-9 and would
like to estimate the values of $b_j$ and $f_{j1}$ contained therein, most
are forced to perform a regression analysis of the observed scores
(treating them as if they were the true scores). This situation is
depicted in Figure 3, which shows that the $x_j$ have been substituted for
the $X_j$.

    Insert Figure 3 about here

The regression parameters giving the expectation of $y$ for fixed $x_j$
are designated with primes, $b'_j$, to indicate their correspondence to
the respective structural parameters, $b_j$. Patently, the $b'_j$ will
equal the corresponding $b_j$ only under a very limited set of
conditions. Using estimators of the $b'_j$ as estimators of the $b_j$ is
unsatisfactory in most applications, because the $b'_j$ are neither
unbiased nor consistent estimators of the $b_j$. Therefore, inferences
about the nature of change and its determinants can be inaccurate or
misleading if based on the regression estimators, $b'_j$. In the next
section we demonstrate the bias and inconsistency of the observed-score
estimators and describe the potentially deleterious effects of
measurement error on inferences about change.

PROOF OF BIAS AND INCONSISTENCY

In this section we consider the consequences of using the $b'_j$ as
estimators of the structural coefficients. For ease of exposition, the
case of one X variable is taken up first and then the two-predictor
case. Following proofs for these cases, the proof of bias and
inconsistency for the general multiple variable situation is derived.

The Single-Predictor Case

To prove that the Ordinary Least Squares (OLS) regression analysis of
the observed scores produces biased and inconsistent estimators of the
structural parameters when there is a posttest and a single pretest, we
begin by writing:

$$y = b'_0 + b'_1 x_1 + e' \tag{10}$$

and letting $f_{10} = 0$ and $f_{11} = 1.0$ in Equation 3 and $f_{y0} = 0$
and $f_{y1} = 1.0$ in Equation 6:

$$x_1 = X_1 + u_1 \tag{11}$$

$$y = Y + v \tag{12}$$

The OLS estimators of the slope and intercept are

$$b'_1 = \frac{\sum (x_{1k} - \bar{x}_1)(y_k - \bar{y})}{\sum (x_{1k} - \bar{x}_1)^2} = \frac{s_{yx_1}}{s^2_{x_1}} \tag{13}$$

and

$$b'_0 = \bar{y} - b'_1 \bar{x}_1 \tag{14}$$

The summation is taken over k = 1 to N units of observation. Substituting
Equations 11 and 12 into Equation 13, we find

$$b'_1 = \frac{\sum [(Y + v) - (\bar{Y} + \bar{v})][(X_1 + u_1) - (\bar{X}_1 + \bar{u}_1)]}{\sum [(X_1 + u_1) - (\bar{X}_1 + \bar{u}_1)]^2}$$

$$= \frac{\sum (Y - \bar{Y})(X_1 - \bar{X}_1) + \sum (X_1 - \bar{X}_1)(v - \bar{v}) + \sum (Y - \bar{Y})(u_1 - \bar{u}_1) + \sum (v - \bar{v})(u_1 - \bar{u}_1)}{\sum (X_1 - \bar{X}_1)^2 + 2\sum (X_1 - \bar{X}_1)(u_1 - \bar{u}_1) + \sum (u_1 - \bar{u}_1)^2} \tag{15}$$

On the assumption that $E(X_1u_1) = E(X_1v) = E(Yu_1) = 0$, the
second and third terms in the numerator and the second term in the
denominator approach zero as the sample increases without limit. Thus,
the probability limit for $b'_1$ is given as

$$\operatorname{plim}\, b'_1 = \frac{s_{YX_1} + s_{vu_1}}{s^2_{X_1} + s^2_{u_1}} \tag{16}$$

Making the additional assumption that errors of measurement in $y$ and
$x_1$ are uncorrelated and noting that by definition
$b_1 = s_{YX_1}/s^2_{X_1}$, we divide both the numerator and denominator
of Equation 16 by $s^2_{X_1}$ to obtain

$$\operatorname{plim}\, b'_1 = \frac{s_{YX_1}/s^2_{X_1}}{1 + s^2_{u_1}/s^2_{X_1}} \tag{17}$$

$$= \frac{b_1}{1 + s^2_{u_1}/s^2_{X_1}} \tag{18}$$



The probability limit of $b'_1$ does not equal $b_1$ but underestimates
it. Thus, the OLS estimator of the slope of the regression of the
posttest on pretest is an inconsistent estimator of the structural
parameter. By similar steps and noting that
$E(e - b_1u_1)(x_1 - \bar{x}_1)$ does not equal zero, it follows that
$b'_1$ is a biased estimator of $b_1$. From psychometric theory
(Lord & Novick, 1968) the population reliability of the observed variable
$x_1$ can be written as

$$r_{11} = \frac{s^2_{X_1}}{s^2_{x_1}} = \frac{s^2_{X_1}}{s^2_{X_1} + s^2_{u_1}} \tag{19}$$

Using this identity Equation 18 can be rearranged into a more familiar
form:

$$\operatorname{plim}\, b'_1 = \frac{b_1}{1 + s^2_{u_1}/s^2_{X_1}} = \frac{s^2_{X_1}}{s^2_{X_1} + s^2_{u_1}}\, b_1 = r_{11} b_1 \tag{20}$$

Alternatively,

$$b_1 = \frac{\operatorname{plim}\, b'_1}{r_{11}} \tag{21}$$

For additional details concerning this proof the interested reader is
referred to Bohrnstedt (1969, pp. 122-125), Cochran (1968, pp. 651-652),
Johnston (1963), and Schmidt (1976, pp. 105-115).

Since $r_{11} \leq 1.0$, $b'_1 \leq b_1$. The relationship (structural
coefficient) between the latent variables is always greater on average
than that which would be inferred from the OLS regression of observed
scores when these observations are fallible. Thus, errors of measurement
"attenuate" the regression of $Y$ on $X_1$, that is, they bias the
estimate of the slope toward zero. Note that it is the errors in $x$ and
not those in $y$ which cause the bias as long as $E(vu_1) = 0$. Since a
value of 1.0 for $b_1$ would indicate no expected change from Time 1 to
Time 2 measurements, the attenuated estimator $b'_1$ will lead to the
faulty inference that persons above and below the pretest mean
($\bar{X}_1 = \bar{x}_1$) will show more absolute change than is actually
the case. Of course, this is the well-known regression to the mean
phenomenon caused by errors of measurement (Campbell & Stanley, 1963).
The point is that inferences about true change and about true change as a
function of true initial status (Thomson, 1924; Werts & Hilton, 1977)
will be inaccurate because of the unreliability of the pretest
measurements.
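
The single-predictor results can be turned into a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the variances and slope are assumed values, not figures from the report. It computes the reliability, the large-sample value of the observed-score slope (reliability times the structural slope), the spurious "change" this implies for a person 10 points above the pretest mean when the true slope is 1.0, and the slope recovered by dividing by the reliability.

```python
def reliability(var_true, var_error):
    """True-score variance divided by observed-score variance."""
    return var_true / (var_true + var_error)


def plim_observed_slope(b1, var_true, var_error):
    """Large-sample observed-score slope: reliability times b1."""
    return reliability(var_true, var_error) * b1


def disattenuated_slope(b1_observed, r11):
    """Recover the structural slope by dividing by the reliability."""
    return b1_observed / r11


# Hypothetical values for illustration (not from the report):
# true-score variance 75 and error variance 25, so r11 = .75; b1 = 1.0.
r11 = reliability(75.0, 25.0)
b1_obs = plim_observed_slope(1.0, 75.0, 25.0)

# Apparent expected change for a pretest score 10 points above the mean,
# although the true expected change is (b1 - 1) * 10 = 0.
apparent_change = (b1_obs - 1.0) * 10.0

print(r11, b1_obs, apparent_change, disattenuated_slope(b1_obs, r11))
# prints: 0.75 0.75 -2.5 1.0
```

Under these assumed values an analyst would infer that high scorers lose 2.5 points on average, which is regression to the mean produced entirely by pretest unreliability.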

The Two-Predictor Case

It is now our purpose to demonstrate the bias and inconsistency in
the $b'_j$ in the two-predictor case where $X_1$ is a pretest and $X_2$
represents another determinant of change, either a treatment or
background variable. If this second variable is classificatory, i.e.,
represents membership in a treatment or sociodemographic group, then
$X_2$ becomes a coded variable, and the $b_2$ is a function of the mean
differences between groups. Thus, analysis of covariance can be
represented as a standard multiple regression problem (Cohen & Cohen,
1975; Overall & Klett, 1972). Our concern is with the structural model
specifying change as a function of initial status and treatments or
background characteristics. This is most easily dealt with by expressing
$Y$ as a function of $X_1$ and $X_2$:

$$Y = b_0 + b_1X_1 + b_2X_2 + e \tag{22}$$

We begin with the standard definitions of the structural parameters as
given in any advanced text on linear models:

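$$b_1 = \frac{s^2_{X_2}\, s_{YX_1} - s_{X_1X_2}\, s_{YX_2}}{s^2_{X_1}\, s^2_{X_2} - s^2_{X_1X_2}} \tag{23}$$

$$b_2 = \frac{s^2_{X_1}\, s_{YX_2} - s_{X_1X_2}\, s_{YX_1}}{s^2_{X_1}\, s^2_{X_2} - s^2_{X_1X_2}} \tag{24}$$

and

$$b_0 = \bar{Y} - b_1\bar{X}_1 - b_2\bar{X}_2 \tag{25}$$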

Adding the stipulations that $x_2 = X_2 + u_2$ and that
$E(u_1u_2) = E(u_2v) = E(X_2u_2) = E(Yu_2) = E(X_1u_2) = 0$ to the
bivariate regression model considered above, we can derive the
expressions for the expected values of $b'_j$ as functions of the $b_j$.
The formulas for the observed-score estimators analogous to Equations
23-25 are

$$b'_1 = \frac{s^2_{x_2}\, s_{yx_1} - s_{x_1x_2}\, s_{yx_2}}{s^2_{x_1}\, s^2_{x_2} - s^2_{x_1x_2}} \tag{26}$$

$$b'_2 = \frac{s^2_{x_1}\, s_{yx_2} - s_{x_1x_2}\, s_{yx_1}}{s^2_{x_1}\, s^2_{x_2} - s^2_{x_1x_2}} \tag{27}$$

and

$$b'_0 = \bar{y} - b'_1\bar{x}_1 - b'_2\bar{x}_2 \tag{28}$$

Using the psychometric identity that $s_{yx_j} = s_{YX_j}$ under the
stated assumptions (Lord & Novick, 1968) and substituting expressions for
the true scores into Equations 26 and 27, the following results are
obtained:

$$b'_1 = \frac{(s^2_{X_2} + s^2_{u_2})\, s_{YX_1} - s_{X_1X_2}\, s_{YX_2}}{(s^2_{X_1} + s^2_{u_1})(s^2_{X_2} + s^2_{u_2}) - s^2_{X_1X_2}} \tag{29}$$

and

$$b'_2 = \frac{(s^2_{X_1} + s^2_{u_1})\, s_{YX_2} - s_{X_1X_2}\, s_{YX_1}}{(s^2_{X_1} + s^2_{u_1})(s^2_{X_2} + s^2_{u_2}) - s^2_{X_1X_2}} \tag{30}$$

Clearly the $b'_j$ do not approach the $b_j$ in the limit unless
$s^2_{u_1} = s^2_{u_2} = 0$.

Thus, the $b'_j$ are not consistent estimators for the structural
coefficients. As with the bivariate case, the bias of the $b'_j$ follows
from the fact that the expected value of the covariance of the residuals
from regression and the true $X_j$ values does not equal zero. Equations
29 and 30 reveal how potentially misleading the observed-score regression
weights can be as estimators of the structural coefficients. The value
of $b'_1$ can be greater or less than $b_1$ depending upon the magnitude
of $s^2_{u_2}$. A similar result holds for $b'_2$ and $b_2$. While in the
bivariate regression case the slope estimator is attenuated by errors of
measurement on the average, in the multiple regression case the value of
$b'_1$ can be either attenuated or "accentuated" (Wiley & Hornik, 1972) by
measurement or observational errors.
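
To make the two-predictor result concrete, the following worked example evaluates the large-sample observed-score weights for one set of hypothetical population values. All numbers are assumptions chosen for illustration and do not come from the report; the structural error $e$ is assumed uncorrelated with $X_1$ and $X_2$, so the predictor-criterion covariances follow directly from the structural equation.

```latex
% Hypothetical values: s^2_{X_1} = s^2_{X_2} = 1, s_{X_1X_2} = .6,
% b_1 = 1, b_2 = -.2, s^2_{u_1} = 1 (so r_{11} = .5), s^2_{u_2} = 0 (r_{22} = 1).
\begin{align*}
s_{YX_1} &= b_1 s^2_{X_1} + b_2 s_{X_1X_2} = 1 - .12 = .88,\\
s_{YX_2} &= b_1 s_{X_1X_2} + b_2 s^2_{X_2} = .6 - .2 = .40,\\
b'_1 &= \frac{(s^2_{X_2}+s^2_{u_2})\,s_{YX_1} - s_{X_1X_2}\,s_{YX_2}}
             {(s^2_{X_1}+s^2_{u_1})(s^2_{X_2}+s^2_{u_2}) - s^2_{X_1X_2}}
      = \frac{(1)(.88) - (.6)(.40)}{(2)(1) - .36}
      = \frac{.64}{1.64} \approx .39,\\
b'_2 &= \frac{(s^2_{X_1}+s^2_{u_1})\,s_{YX_2} - s_{X_1X_2}\,s_{YX_1}}
             {(s^2_{X_1}+s^2_{u_1})(s^2_{X_2}+s^2_{u_2}) - s^2_{X_1X_2}}
      = \frac{(2)(.40) - (.6)(.88)}{1.64}
      = \frac{.272}{1.64} \approx .17.
\end{align*}
```

Under these assumed values the pretest weight is attenuated from 1 to about .39, and the perfectly measured second predictor receives a weight of about +.17 although its structural coefficient is -.2. The sign reversal arises solely from the unreliability of the pretest combined with the correlation between the predictors.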



The fact that $b'_j$ can be statistically significant with a negative
sign while the structural parameter has a relatively large positive value
has led many statisticians and psychometricians to recommend strongly
against the use of $b'_j$ as estimators of the $b_j$, e.g., Cochran (1968),
Cohen & Cohen (1975), Cronbach (1976), Hummel-Rossi & Weinberg (1975),
and Lord (1958). The bias in the observed-score regression weights
attributable to measurement error certainly poses a grave problem for
longitudinal research. It can produce inferences that the effect of
Treatment 1 relative to Treatment 2 was harmful when in truth it was
beneficial. Or, it may lead to conclusions like high SES children
changed more than low SES children when, in fact, relative change was in
the opposite direction. This would seem an intolerable state of
affairs. Moreover, the regression weights are not the only parameters of
change affected by measurement error. The others will be described after
the expression of the bias in the $b'_j$ has been derived for the general
case of J predictors.



The General Case

Let $\underline{S}_{XX}$ and $\underline{S}_{xx}$ be the
variance-covariance matrices of the true and observed scores,
$\underline{S}_{uu}$ the covariance matrix of the errors of measurement in
the $x$ variables, $\underline{s}_{yx} = \underline{s}_{YX}$ the vectors of
predictor-criterion covariances, and $\underline{b}$ and $\underline{b}'$
the vectors of regression weights for the true and fallible variables,
respectively. Then it follows that

$$\underline{b}\,\underline{S}_{XX} = \underline{s}_{YX} \tag{31}$$

$$\underline{b}'\,\underline{S}_{xx} = \underline{s}_{yx} = \underline{s}_{YX} \tag{32}$$

$$\underline{b}'(\underline{S}_{XX} + \underline{S}_{uu}) = \underline{s}_{YX} \tag{33}$$

and

$$\underline{b}\,\underline{S}_{XX} = \underline{b}'(\underline{S}_{XX} + \underline{S}_{uu}) \tag{34}$$

where the vectors $\underline{b}$ and $\underline{b}'$ are transposed to
make the quantities conformable for multiplication. Postmultiplying both
sides of Equation 34 by $\underline{S}_{XX}^{-1}$ we find that
$\underline{b}$ can be written as a weighted function of $\underline{b}'$
(cf. Lindley, 1947, p. 227, Eq. 40):

$$\underline{b} = \underline{b}'\,\underline{S}_{xx}\,\underline{S}_{XX}^{-1} \tag{35}$$

This matrix product indicates that unless the $x_j$ are uncorrelated, the
bias in $b'_j$ as an estimator of $b_j$ depends not only on the errors in
$x_j$ but also on all intercorrelations $r_{x_jx_{j'}}$ ($j' \neq j$).
Thus, even when $X_j$ is measured perfectly so that $x_j = X_j$, $b'_j$
still will be a biased estimator of $b_j$.
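
The general-case relation lends itself to a short computation of the kind an investigator might run before collecting data. The sketch below is not the report's FORTRAN program; it is a minimal large-sample version under stated assumptions: measurement errors are mutually uncorrelated (so the error covariance matrix is diagonal), each error variance is obtained from the predictor's reliability, and sample-size effects are ignored. The function names and the example covariances, reliabilities, and weights are hypothetical.

```python
def solve(a, rhs):
    """Solve a @ z = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [list(row) + [rhs[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def observed_score_weights(s_true, reliabilities, b_true):
    """Large-sample observed-score weights b' from b'(S_XX + S_uu) = b S_XX.

    s_true        -- covariance matrix of the true predictors (list of lists)
    reliabilities -- reliability r_jj of each observed predictor
    b_true        -- structural weights b_j (intercept excluded)
    """
    n = len(b_true)
    # Reliability = true variance / (true + error variance), rearranged.
    err_var = [s_true[j][j] * (1.0 - r) / r
               for j, r in enumerate(reliabilities)]
    s_obs = [[s_true[i][j] + (err_var[i] if i == j else 0.0)
              for j in range(n)] for i in range(n)]
    # Predictor-criterion covariances: s_YX = b S_XX.
    s_yx = [sum(b_true[i] * s_true[i][j] for i in range(n))
            for j in range(n)]
    # S_xx is symmetric, so b' S_xx = s_YX is solved as S_xx b' = s_YX.
    return solve(s_obs, s_yx)


# Hypothetical three-predictor example: pretest, treatment, background.
s_true = [[1.0, 0.3, 0.5],
          [0.3, 1.0, 0.2],
          [0.5, 0.2, 1.0]]
b_true = [0.8, 0.5, 0.3]
b_obs = observed_score_weights(s_true, [0.7, 1.0, 0.8], b_true)
for name, b, bp in zip(("pretest", "treatment", "background"), b_true, b_obs):
    print(f"{name:10s}  b = {b:.3f}   b' = {bp:.3f}")
```

In this example the treatment variable is given a reliability of 1.0, yet its observed-score weight still differs from its structural weight because it is correlated with the fallible pretest and background measures. Setting every reliability to 1.0 returns the structural weights unchanged.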

Errors of measurement, therefore, should exert their most damaging
effects in survey and quasi-experimental studies, where the lack of
experimental control will result in substantial intercorrelations among
the pretest and factors associated with change. In these kinds of
studies of change the weight associated with the pretest "undercontrols"
(Wiley & Hornik, 1972) or insufficiently adjusts for differences among
individuals in initial status. In ANCOVA the regression correction for
covariate (pretest) differences between groups may be too little or too
great, and the resulting comparison of differences in adjusted posttest
means will be biased if the groups differed initially (Snedecor &
Cochran, 1967). In predictor sets where some of the variables are more
reliable than others, part of the contribution of the less reliable
predictor will be attributed to the more reliable predictors. However,
other factors associated with quasi- and nonexperimental studies may more
than offset the inadequate adjustment due to measurement error (Cronbach
et al., 1977; Heckman, 1979; Olejnik & Porter, 1981; Weisberg, 1979).
Under some circumstances, no adjustment at all for the pretest reduces
the bias in $b'_j$.



EFFECTS OF MEASUREMENT ERROR ON OTHER ESTIMATES

In addition to the problems with the regression weights there are
other potential pitfalls to interpreting the observed-score regression
results as providing veridical information about the hypothesized
structural model. Some of these will be briefly summarized. First,
errors of measurement will make the overall adequacy of the model appear
less than if the variables were perfectly reliable. Both indices of the
goodness of fit of the data to the model, the coefficient of multiple
correlation or determination ($R^2$), and the mean square error (MSE), or
residual variance, will be biased by observational errors. $R^2$ will be
attenuated and MSE inflated on the average (Bohrnstedt, 1969; Cochran,
1968, 1970). When the $x_j$ are uncorrelated, Cochran (1970) has shown
that the degree of attenuation in $R^2$ is a function of the reliabilities
of $y$ and the $x_j$:

$$R'^2 = R^2\, r_{yy}\, \bar{r} \tag{36}$$

where $\bar{r}$ is a weighted average of the $r_{jj}$. It is apparent from
the formula for the mean square error or the residual variance from
regression,

$$s^2_e = s^2_y (1 - R^2) \tag{37}$$

that errors of measurement will have a proportionally greater effect as
$R^2$ increases. Although he was unable to derive a closed form
expression for $R'^2$ when the $x_j$ are correlated, Cochran (1970)
suggests that Equation 36 "may serve as a rough guide to the effect of
errors of measurement on the squared multiple correlation in many
applications". The value of $R'^2$ may be up to 10 percent higher (than
Equation 36) if most correlations are positive and harmful and the
$r_{jj}$ exceed .7, and up to 15% higher if the $r_{jj}$ are as low as .5.
The decrease in explained variation means that the power of statistical
tests will be lowered. While the errors in $y$ do not contribute to the
bias in the $b'_j$ as long as $v$ is uncorrelated with $Y$, $X_j$ and
$u_j$, they do contribute to the reduction in $R^2$ as indicated in
Equation 36 and thus also to loss of power (cf. Bohrnstedt, 1969; Cleary
& Linn, 1969; Cochran, 1968, 1970; Nicewander & Price, 1977; Sutcliffe,
1958; Walker & Lev, 1953; Winne, 1977).
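
A short numerical illustration may help fix the size of the attenuation in $R^2$. It reads Equation 36 as the product of the true-score $R^2$, the reliability of $y$, and the weighted average reliability of the predictors, which is the form the surrounding text implies. The reliabilities and $R^2$ values are hypothetical and are not taken from Cochran (1970) or from this report.

```latex
% Hypothetical reliabilities: r_{yy} = .80, \bar{r} = .70.
\begin{align*}
R^2 = .50:&\quad R'^2 = (.50)(.80)(.70) = .28,
  &\frac{1-R'^2}{1-R^2} &= \frac{.72}{.50} = 1.44,\\
R^2 = .20:&\quad R'^2 = (.20)(.80)(.70) = .112,
  &\frac{1-R'^2}{1-R^2} &= \frac{.888}{.80} = 1.11.
\end{align*}
```

Under these assumed values the unexplained proportion of variance is inflated by 44 percent when the true-score $R^2$ is .50 but by only 11 percent when it is .20, which is consistent with the statement that errors of measurement have a proportionally greater effect as $R^2$ increases.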

Although the raw partial regression weights are not biased by errors
in y, the standardized partial regression weights and path coefficients
are attenuated by v. Additionally, both prediction and simulation will be
affected by errors in y.

The final consequence of errors of measurement in analyses of change
concerns the distortions they cause in analysis of covariance. As
pointed out previously, errors of measurement in the pretest will bias
the estimates of adjusted posttest differences if the groups differ in
mean pretest scores (Campbell & Erlebacher, 1970; Dunivant, 1975, 1977;
Kenny, 1975; Overall & Woodward, 1976a, b; Rubin, 1977; Werts & Linn,
1971). ANCOVA is predicated on the assumption of homogeneous pooled
within-groups regressions of criterion on covariate. Whenever the slopes
are heterogeneous, or equivalently, there is a covariate-by-research
factor interaction, ANCOVA is no longer appropriate. A mathematical
model which evaluates the differences in regression lines must be
adopted. This may take the form of the Johnson-Neyman technique or
product-vectors in multiple regression.

Recently, Rogosa (1977a, b) has demonstrated that the loss of power
due to errors of measurement will cause the investigator to fail to
reject the hypothesis of homogeneity of regression in many situations
where it is false (Type II error). Thus, ANCOVA will be utilized on many
occasions when the Johnson-Neyman technique or analysis of partial
variance (Cohen & Cohen, 1975) are appropriate. Faulty inferences about
the underlying causal model will frequently result. The reader is
referred to Rogosa's (1977a, b) papers for a presentation of the biasing
effects of measurement error on the Johnson-Neyman technique. A recent
search of the literature (see appendix) located only one reference
(Busemeyer, 1980) out of over 400 articles surveyed which treated the
effects of measurement error on estimators of nonadditive or interactive
effects in multiple regression (Dunivant, 1980). It is fair to conclude
from the demonstrations presented in this section that in analyses of
change the potential for errors of inference caused by errors of
measurement is very great. This problem should be of considerable
concern to any investigator who collects test-retest data and wishes to
explain change in scores during the interval. In the following we review
several examples of the magnitude of the bias in $b'_j$ as estimators of
$b_j$.

DEMONSTRATIONS OF BIAS CAUSED BY ERRORS OF |IEASUREMENT 
Several statisticians have constructed h3rpothetlcal examplis-s to. 
Illustrate th^ kinds of problems caused by errors'^^f measurement. 
Cochran (1968) provided the coefficients reproduced as Table !• 



II - 16 



42- 



Inspection of the entries reveals that in the two-predictor case, when
b_1 = 2 and b_2 = 1, the estimators b'_1 and b'_2 may
simultaneously underestimate both b_1 and b_2, or overestimate one and
underestimate the other, depending upon the respective reliabilities of
X_1 and X_2. There is even one example (r_11 = .6, r_22 = 1.0, r_X1X2 = +.3)
where b'_1 < b'_2. In an example constructed by Bohrnstedt and Carter (1971)
the observed-score estimate and structural coefficient had opposite signs.
If r_11 = r_22 = .81, r_yy = 1.0, r [subscript illegible] = .7, and
[symbol illegible] = .031, then b'_1 = .03 while b_1 = -.186.
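
The entries of Table 1 can be checked with a short calculation. The sketch below is not Cochran's own computation; it assumes unit-variance true scores X_1 and X_2 with correlation `rho`, measurement errors uncorrelated with everything else, and a large sample, so that the OLS slopes are functions of the observed variances and covariances only. The function name `observed_score_slopes` is hypothetical.

```python
def observed_score_slopes(b1, b2, r11, r22, rho):
    """Large-sample OLS slopes of y on the fallible x1, x2.

    Assumes unit-variance true scores with correlation rho and
    mutually uncorrelated measurement errors (illustrative setup).
    """
    var_x1 = 1.0 / r11          # observed variance = true variance / reliability
    var_x2 = 1.0 / r22
    cov_x1x2 = rho              # uncorrelated errors leave covariances unchanged
    cov_yx1 = b1 + b2 * rho     # cov(y, x1) = b1*var(X1) + b2*cov(X1, X2)
    cov_yx2 = b1 * rho + b2
    det = var_x1 * var_x2 - cov_x1x2 ** 2
    b1_obs = (cov_yx1 * var_x2 - cov_yx2 * cov_x1x2) / det
    b2_obs = (cov_yx2 * var_x1 - cov_yx1 * cov_x1x2) / det
    return b1_obs, b2_obs


# The case singled out in the text: r_11 = .6, r_22 = 1.0, r_X1X2 = +.3.
b1_obs, b2_obs = observed_score_slopes(2.0, 1.0, 0.6, 1.0, 0.3)
print(round(b1_obs, 2), round(b2_obs, 2))
```

Under these assumptions the call reproduces the reversal noted above: the estimate for the larger structural coefficient falls below the estimate for the smaller one. Error in y does not enter the calculation because it changes neither the covariances nor the predictor variances.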

A different approach to demonstrating the bias in partial regression
coefficients has been pursued by Corder-Bolz (1978), Hanushek and Jackson
(1977), Ladd (1956), Marston and Borich (1977), McLean, Ware and McClave
(1975), and Porter (1967). These statisticians have employed Monte Carlo
or simulation techniques to generate data which conform to a structural
model whose parameter values are specified a priori. Hundreds or
thousands of samples of simulated observations are then analyzed by OLS
regression and the mean and variance of the resulting b'_j are compared
with the preset b_j. Thus, Hanushek and Jackson (1977) generated 100
samples of 200 observations each from a model with parameters
b_0 = 15, b_1 = 1, b_2 = 2, and r_x1x2 = [illegible]. When r_22 equaled .8,
the mean estimates were b'_1 = .99 and b'_2 = 1.31. In another
simulated experiment the means were .97 and .36 for b'_1 and b'_2,
respectively, when the reliability of X_2 was lowered to .4.
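
The logic of such a Monte Carlo demonstration can be sketched in a few lines. This is not a replication of Hanushek and Jackson's experiment: the sketch assumes independent, unit-variance normal true scores, a unit-variance disturbance, and error in X_2 only, none of which is stated in the source, so its averages will not match theirs. The function name `mean_ols_slopes` and its defaults are hypothetical.

```python
import random


def mean_ols_slopes(n_samples=100, n_obs=200, b0=15.0, b1=1.0, b2=2.0,
                    rel_x2=0.8, seed=0):
    """Average OLS slopes over simulated samples with error in X2 only."""
    rng = random.Random(seed)
    # Error s.d. that yields the stated reliability for a unit-variance true score.
    err_sd = ((1.0 - rel_x2) / rel_x2) ** 0.5
    total_b1 = total_b2 = 0.0
    for _ in range(n_samples):
        x1s, x2s, ys = [], [], []
        for _ in range(n_obs):
            true_x1 = rng.gauss(0.0, 1.0)
            true_x2 = rng.gauss(0.0, 1.0)   # assumed independent of X1
            ys.append(b0 + b1 * true_x1 + b2 * true_x2 + rng.gauss(0.0, 1.0))
            x1s.append(true_x1)                            # X1 observed without error
            x2s.append(true_x2 + rng.gauss(0.0, err_sd))   # X2 observed with error
        m1, m2, my = sum(x1s) / n_obs, sum(x2s) / n_obs, sum(ys) / n_obs
        # Centered sums of squares and cross-products for the normal equations.
        s11 = sum((a - m1) ** 2 for a in x1s)
        s22 = sum((c - m2) ** 2 for c in x2s)
        s12 = sum((a - m1) * (c - m2) for a, c in zip(x1s, x2s))
        s1y = sum((a - m1) * (y - my) for a, y in zip(x1s, ys))
        s2y = sum((c - m2) * (y - my) for c, y in zip(x2s, ys))
        det = s11 * s22 - s12 ** 2
        total_b1 += (s1y * s22 - s2y * s12) / det
        total_b2 += (s2y * s11 - s1y * s12) / det
    return total_b1 / n_samples, total_b2 / n_samples


print(mean_ols_slopes(rel_x2=0.8))
print(mean_ols_slopes(rel_x2=0.4))
```

With uncorrelated predictors the mean of b'_1 stays near b_1 while the mean of b'_2 shrinks toward the reliability of X_2 times b_2, which is the qualitative pattern the simulation studies report.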

Corder-Bolz (1978), McLean et al. (1975), Marston and Borich (1977),
and Porter (1967) investigated the effects of measurement error in the
covariate on tests of adjusted group differences in ANCOVA. McLean et al.
(1975) varied the reliabilities of the covariate within and across groups,
the sample size, and the mean differences on the covariate between the
experimental and control groups in a 6 x 2 x 2 x 2 factorial ANCOVA
design. Both the empirical alpha level (Type I error probabilities) and
empirical power (1 - Type II error probabilities) of the hypothesis of no
adjusted posttest differences were evaluated for 2000 sets of generated
observations. The results indicated that if the groups' pretest means were
equal (no pretest-group factor correlation), then the nominal alpha values
were not significantly disturbed by errors in the covariate, as would be
expected (Kenny, 1975; Overall & Woodward, 1976a, b). However, if there
was a pretest-treatment correlation, then the nominal alpha values were
greatly affected. In general, the fallibility of the covariate resulted in
an underadjustment of posttest differences so that the empirical alphas
exceeded the nominal alphas. With reliabilities in the .5 range, Type I
errors were made in 40% to 100% of the samples depending upon whether the n
per group was 10 or 100. For all conditions empirical power differed
significantly from true power. Sometimes the empirical power was
significantly lower and sometimes it was significantly greater than the
theoretical value. According to McLean et al. (1975) "the most dramatic
result (was) that where the experimental group actually experienced a gain
and the control group did not and the pretest mean of the experimental
group was less than that of the control group, the adjusted posttest means
indicated that the control group was better" (p. 550).
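
The underadjustment described above can be illustrated with a simple expected-value calculation. The sketch assumes a single fallible covariate, a common within-group structural slope, no treatment effect, and a within-group reliability that attenuates the observed slope multiplicatively; these are simplifying assumptions for illustration, not the design used by McLean et al. The function name is hypothetical.

```python
def spurious_adjusted_difference(slope_true, reliability, pretest_gap):
    """Expected adjusted posttest difference when the treatment has no effect.

    slope_true  : structural slope of posttest on true pretest
    reliability : within-group reliability of the observed pretest
    pretest_gap : difference between the groups' pretest means
    """
    # Posttest gap that is entirely due to the pre-existing pretest gap.
    posttest_gap = slope_true * pretest_gap
    # The observed within-group slope is attenuated by the reliability.
    slope_observed = reliability * slope_true
    # ANCOVA removes only slope_observed * pretest_gap of that difference.
    return posttest_gap - slope_observed * pretest_gap


print(spurious_adjusted_difference(1.0, 0.5, 2.0))
print(spurious_adjusted_difference(1.0, 1.0, 2.0))
```

The residual difference equals slope_true x (1 - reliability) x pretest_gap. It vanishes when the pretest means are equal or the covariate is perfectly reliable, which agrees with the pattern of results summarized in the preceding paragraph.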

After conducting extensive simulations in which reliabilities,
pre-posttest correlations, covariate-treatment correlations, and treatment
effects were varied systematically through a wide range of values,
Corder-Bolz (1978) concluded "that the models traditionally used to
evaluate change [including ANCOVA] can produce seriously distorted results"
(p. 975). Porter (1967) also conducted extensive simulations. His work
indicated that the bias in ANCOVA estimators and significance tests could be
very large. In contrast to these findings, which accord closely with the
derivations presented above, Marston and Borich (1977) reported that in
their Monte Carlo investigation of ANCOVA with an unreliable covariate, the
tests of adjusted group differences did not exceed the nominal alpha level,
even when the pretest means differed. It is difficult to explain this
anomalous result assuming that their data generation procedure performed as
they expected. Using a complete, over-identified, nondynamic two-equation
model, Ladd (1956) generated 30 samples of observations on two endogenous
and two exogenous variables. The reliabilities of the variables ranged
from .74 to .92. When the two regression equations were estimated
separately by OLS in the 30 samples, Ladd found that the average b'_j
were sometimes greater than and sometimes smaller than their respective
structural parameter values. The average least squares bias ranged from 0
to 31% for the eight regression coefficients across the 30 samples.

To summarise the results of the Monte Carlo demonstrations, in four of
five investigations the findings of the simulated data were congruent with
the proofs and derivations presented in the previous sections. Thus, the
evidence is quite substantial that if the observations are actually
generated by a mechanism modeled by Equations 1-[illegible] (or 7-9), OLS
regression estimates derived from observed scores will lead to errors of
inference because of the errors of measurement.

Before concluding this section it is instructive to note that several
writers have discussed the possible bias due to the "errors in variables"
in terms of zero-order and partial correlations. Since these correlations
can be written as functions of the b_j, the same bias will be observed in
the r' as in the b'_j. Consider the one-predictor case where the
zero-order correlation between true pretest and true posttest can be written

$$ r_{YX_1} = b_1 \frac{s_{X_1}}{s_Y} \qquad (38) $$

and the corresponding estimator based on fallible data is

$$ r'_{yx_1} = b'_1 \frac{s_{x_1}}{s_y} \qquad (39) $$

Since b'_1 is less than b_1, r'_{yx_1} will be less than r_{YX_1} except for
sampling error. Bohrnstedt and Carter (1971) have constructed extensive tables
illustrating the possible "attenuating" effects of measurement error on
r_{YX_1}. Psychological researchers have been cognizant of these kinds of
problems since Spearman's (1904) classic paper.
The case for the coefficient of partial correlation is directly
analogous to that for the partial regression weight. Following DuBois
(1957, p. 137, Eq. 80) write

[Equation 40: illegible in source]

Clearly, the observed partial correlation will be subject to the same
distortions as are the partial regression weights. Cohen and Cohen
(1975) have furnished several examples, which are reproduced in Table 2,
of the kinds of bias in partial correlations that can result from errors of
measurement in the partialled variable.



Insert Table 2 about here






In all of Cohen and Cohen's examples the reliability of the pretest
(the partialled variable) is .7 and the posttest and research factor
reliabilities are 1.0. The partial correlations may be interpreted as
the correlation of the research factor with change. In the first example
the observed correlation of the research variable with change is .00
while the true or structural coefficient equals -.23. In other examples
the observed r_{yx2.x1} underestimates or overestimates r_{YX2.X1}, and in
some it even has a different sign. Regardless of the perspective taken
the same conclusion seems to obtain: errors in variables will bias
statistics based on observed scores as estimators of the underlying
structural model and are likely to lead to erroneous statements about the
determinants of change. Additional demonstrations of the effects of
measurement error have been offered by Brewer, Campbell & Crano (1970),
Campbell & Boruch (1975), Campbell & Erlebacher (1970), Evans & Anastasio
(1968), Hummel-Rossi & Weinberg (1975), Kahneman (1965), Linn & Werts
(1973), and Lord (1963).

REGRESSION VERSUS STRUCTURAL COEFFICIENTS 
We pause briefly in our review of the effects of measurement error
in studies of change to reconsider the question of formulating a
structural model. An applied orientation which has a long tradition in
psychology and education disagrees with the importance accorded the
structural model by this reviewer (cf. Draper & Smith, 1966; Graybill,
1961; Lumsden, 1976; Marston & Borich, 1977). The position of this
applied tradition is that the variables of interest are the observed
scores (errors included) because decisions, predictions and evaluations
are based on observed rather than true scores. When decisions are based
on observed scores, there is little doubt about the validity of this
position.

It should be recognised, however, that with respect to
decision-making the true score has been redefined as identical to the
observed score. Random fluctuations in scores are no longer regarded as
error but are treated as part of the inherent variability of the
predictor. Thus, if one is trying to predict the observed final status
or gain, or one is attempting to model economic decisions where judgments
of producers and consumers are based on observed values (Johnston, 1972),
then the OLS regression estimators are unbiased and consistent for the
parameters of interest. This last statement is subject to one
qualification: if the structural model accurately reflects the causal
mechanism and a group of individuals is selected for study by some
nonrandom process independent of the pretest, then the structural
coefficients will provide the optimal estimates (Warren, White, & Fuller,
1974). In this context inferences from regression analysis of the
observed data must be limited to randomly drawn samples. However, it has
also been demonstrated that if selection into the treatment groups in an
ANCOVA design is made explicitly on the basis of the pretest scores, then
the observed-score ANCOVA estimates are unbiased for the structural
parameters (Goldberger, 1972; Kenny, 1975; Overall & Woodward, 1976a;
Rubin, 1977; Weisberg, 1979). This usually means conditional
randomization where the probability of assignment to a group for each
value of the pretest is explicitly determined by the experimenter. When
the treatment groups' pretest distributions do not overlap, we have the
regression-discontinuity design which has been advocated by Campbell
(1969).

In almost all social science research, particularly studies of
change, the structural conception is the more appropriate. The
structural model represents the causal structure or theory of the data.
Research concerned with theory and hypothesis testing should be
conceptualized in terms of the underlying dynamics of the behavioral
processes under consideration. Hanushek and Jackson (1977) are
particularly lucid on this issue: "Structural equations . . . represent
the way in which we believe the observed data were generated, i.e., the
underlying behavioral and stochastic processes that led to the observed
data. The structural representation corresponds to the theoretical
models underlying the analysis and relates to the formulation of the
model where a priori information about specification or coefficient
values is relevant." (Hanushek & Jackson, 1977, pp. 227-228).
Furthermore, if one is interested in testing competing theories, then the
structural models should be estimated since the theoretical models apply
to latent or true variates and not the observed values.

In another sense the structural coefficients may be taken as more
basic or fundamental than those derived from the observed-score
distributions (Goldberger, 1973; Hanushek & Jackson, 1977). The
parameters of the distributions of the observed scores can be expressed
as functions of the structural parameters, for example, Equation 34. A
change in the value of one structural coefficient can change the values
of several or all of the observed-score coefficients. Thus, if we record
changes in the observed-score estimators as different samples are drawn
(e.g., males and females or 1965 and 1975), we have little way of
ascertaining the component(s) of the theory on which they differ. The
implications of these facts are clear: most behavioral research,
particularly that called "basic" research, should be conceptualized and
analyzed in terms of structural models. We now take up some issues
related to the estimation of structural parameters.



IDENTIFICATION REQUIREMENTS 
Identification of a statistical model refers to the capability of
uniquely determining the value of each hypothesized component in the
model. For linear statistical models all of the information contained in
the observations which is available for the estimation of parameters is
contained in the variance-covariance matrix S. The number of
parameters which can be identified is equal to the number of unique
elements in S, which is equal to p(p + 1)/2 where p is the total
number of variables represented in the matrix. Let us develop the case
considered previously where there are fallible pretest and posttest
scores related as in Equation [number illegible]. The covariance matrix of
the observable vector (y, x_1) in terms of parameters rather than sample
statistics is

$$ \Sigma_{yx} = \begin{bmatrix} b_1^2 s_{X_1}^2 + s_e^2 + s_v^2 & b_1 s_{X_1}^2 \\ b_1 s_{X_1}^2 & s_{X_1}^2 + s_{u_1}^2 \end{bmatrix} \qquad (41) $$

There are three observable quantities, s_y^2, s_x1^2 and s_yx1, which can be
used to uniquely identify the parameters s_X1^2, s_e^2 and b_1. (For present
purposes we ignore the fact that information about the observed means can be
used to identify b_0 as in Equation 14.) However, the structural equations
are defined in terms of five parameters. Since there are only three
variances and covariances, the model cannot be identified without two
independent restrictions. If the measurement error variances
(s_u1^2, s_v^2) or reliabilities (r_yy and r_11) are known a
priori, then the coefficients can be restricted to these values, leaving
the three unknown parameters estimable from the observed variances and
covariances.
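
To make the identification argument concrete, the sketch below solves for the three remaining parameters once the two reliabilities are treated as known. It assumes that r_11 is the ratio of true to observed pretest variance, that r_yy is the ratio of true to observed posttest variance, and that the errors are uncorrelated with each other and with the true scores. The function name `identify_one_predictor` is hypothetical.

```python
def identify_one_predictor(var_y, var_x1, cov_yx1, r_11, r_yy):
    """Recover (b_1, s_X1^2, s_e^2) from observed moments and known reliabilities."""
    var_true_x1 = r_11 * var_x1          # strip pretest error variance
    # Uncorrelated errors: cov(y, x_1) = b_1 * var(X_1).
    b_1 = cov_yx1 / var_true_x1
    var_true_y = r_yy * var_y            # strip posttest error variance
    # True posttest variance = b_1^2 * var(X_1) + disturbance variance.
    var_e = var_true_y - b_1 ** 2 * var_true_x1
    return b_1, var_true_x1, var_e


# Illustrative population: b_1 = 0.8, s_X1^2 = 4, s_e^2 = 1, s_u1^2 = 1, s_v^2 = 0.5.
var_x1 = 4.0 + 1.0
var_y = 0.8 ** 2 * 4.0 + 1.0 + 0.5
cov_yx1 = 0.8 * 4.0
print(identify_one_predictor(var_y, var_x1, cov_yx1,
                             r_11=4.0 / var_x1, r_yy=(var_y - 0.5) / var_y))
```

The numerical values in the usage lines are invented for illustration. With the two reliabilities supplied, three moments determine three parameters; without them the same three moments would have to determine five.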

It is easily proved that when there is only a single measure of each
latent variable and no information about reliabilities is available,
the structural model is underidentified and a priori restrictions must be
imposed in order to make the parameters estimable (Johnston, 1972;
Kendall & Stuart, 1961; Madansky, 1959; Werts, Linn & Jöreskog, 1973;
Wiley, 1973). This is also illustrated by the two-predictor case
described in an earlier section. The structural model given by Equations
3, 4, 6 and 22 contains eight parameters (s^2 [illegible], s_u1^2, s_u2^2,
s_v^2, s_e^2, b_1 and b_2) while there are only 3(3+1)/2 = 6 observed
variances and covariances which can be used to identify the model. Without
a priori information about the error variances (or some function of them,
e.g., the reliabilities) or restrictions on the model (e.g.,
s^2 [subscript illegible] = 0), the model is underidentified and cannot be
estimated from sample data.

In general, there are three methods by which the information
necessary for identification may be provided: (1) actual values or
estimates of the structural parameters may be determined from previous
investigations, (2) the theory may restrict some of the parameters to be
zero or to equal other parameters (e.g., [illegible]), or (3) multiple
measures or indicators of the latent variables or true scores may be
collected (Jöreskog, 1973; Wiley, 1973). This final alternative may be
thought of as imposing a factor structure on the observations and, as a
method, has excited much promising new research in psychometrics,
sociology and econometrics (see Aigner & Goldberger, 1977; Goldberger &
Duncan, 1973). However, in this report our exclusive concern will be
with the first and second methods for identifying structural models with
fallible variables.
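
The counting rule used in this section is easy to mechanize. The following sketch simply compares the number of structural parameters with the p(p + 1)/2 unique variances and covariances; the helper names are hypothetical, and the rule is a necessary condition only.

```python
def unique_moments(p):
    """Number of distinct variances and covariances among p observed variables."""
    return p * (p + 1) // 2


def restrictions_needed(n_parameters, p):
    """Minimum number of independent a priori restrictions for identification."""
    return max(0, n_parameters - unique_moments(p))


# One-predictor case: 5 parameters, observables (y, x_1).
print(unique_moments(2), restrictions_needed(5, 2))
# Two-predictor case: 8 parameters, observables (y, x_1, x_2).
print(unique_moments(3), restrictions_needed(8, 3))
```

Both cases discussed in the text come out two restrictions short, which is why known error variances or reliabilities are required before estimation can proceed.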

LINEARITY CONDITIONS
Even when the model can be identified by the methods just described,
research on the causes of change with fallible variables faces an
additional problem before estimation can proceed: rationalizing
or testing the assumption that the relationship of the observed dependent
variable to the observed independent variables remains linear when the
underlying structural relation is linear (Cochran, 1968, 1972; Kendall &
Stuart, 1961; Lindley, 1947). Here linear means linear in the X_j
(straight line) rather than linear in the b_j. If the structural
relation is exactly linear of the form

$$ Y = b_0 + b_1 X_1 + b_2 X_2 + e \qquad (42) $$

and the X_j and e are independently distributed with E(e | X_j) = 0, does it
follow that

$$ y = b'_0 + b'_1 x_1 + b'_2 x_2 + e', $$

with E(e' | x_j) = 0, is also exactly linear? "The answer is, in general, no;
only under certain quite stringent conditions will linearity be
unimpaired" (Kendall & Stuart, 1961, p. 413).

Lindley (1947) has determined the necessary and sufficient conditions
for the relation to remain linear in the narrow sense, i.e., where x and
e' are not independently distributed. If we assume that the model
specified by Equations 1-9 holds with the assumption that the errors are
mutually and serially uncorrelated, then

[equation illegible in source]

iff

[equation illegible in source]

where the [symbol illegible] are Fisher's cumulant generating functions
(c.g.f.s; logarithms of the characteristic functions) of their suffix
variables (Cochran, 1968, p. 650, Eq. 8.3; Kendall & Stuart, 1961, p. 417,
Ex. 29.12; Lindley, 1947). Thus, when the c.g.f.s of the X_j are multiples
of the c.g.f.s of the u_j, the relation will continue to be linear.
Cochran (1972) observes that "roughly speaking, this implies that u_j
and X_j belong to the same class of distributions. Thus . . . if X_j is
normal, u_j must be normal" (p. 527).

Additional conditions are necessary if we require the x_j to be
distributed independently of the residual from regression (e'), that is,
to maintain linearity in the fuller sense. Fix (1949) proved that for
the case of bivariate regression, if the X, u, v and e have finite means
and if the variance of either the X or u exists, then both X and u must
be normally distributed in order for the observed regression to remain
exactly linear.

In actual data how frequently can we expect Lindley's and Fix's
conditions to hold? Cochran (1972) argues that "the forces which
determine the nature of the distribution of u . . . are quite different
from those that determine the nature of the distribution of the correct
[true] X. Consequently, my opinion is that in such applications even the
Lindley conditions will not be satisfied, except perhaps by fluke or as
an approximation ..." (p. 528). He investigated the nature of the
departure from linearity in simple bivariate cases where Lindley's
conditions did not hold. His results suggest that in many situations the
linear component of the observed-score regression dominates the
curvilinear components, even with a relatively unreliable x. These
findings provide some support for allowing "the ordinary theory to be
used as an approximation" (Kendall, 1951, p. 24). However, Cochran was
unable to obtain any general results that are exact in the bivariate
case, and the nature of the departures from linearity in the multiple
predictor case when Lindley's or Fix's conditions are not satisfied has
not been investigated at all.

SUMMARY

This concludes our initial mathematical analysis of the problems
caused by errors of measurement in investigating change with linear
models. A general structural model for analyzing change has been
presented. The theoretical bias and inconsistency in the observed-score
regression coefficients were proved, and the harmful effects of
measurement error on estimates of the squared multiple correlation, mean
square error, and standardized regression weights were explicated. We
described several demonstrations of how large the bias and how incorrect
the resulting inferences potentially could be. The interpretation and
uses of the structural coefficients were contrasted with those of the
regression coefficients. We introduced the concept of identifiability
and showed how it was essential to determining the estimability of the
structural parameters. Finally, the known conditions for the linearity
of the observed-score relation when the structural relation is linear
were delineated.

Statistical developments from econometrics, sociology, education,
psychology, biometrics, and mathematical statistics were synthesized in
this chapter. This is the first comprehensive (yet, hopefully,
comprehensible) analysis of the problems caused by measurement error in
linear models for analyzing growth that has been made available to
educational researchers. Its purpose is to alert investigators to the
harmful effects of measurement error and to furnish a detailed exposition
of all the major issues. If the objective is realized, future
longitudinal studies will be designed with greater care and interpreted
with greater caution.






FOOTNOTES

1 In this proposal we treat only single-equation estimation
techniques, so the structural coefficients for the paths between X_3 and
X_1 and between X_3 and X_2 are not considered. See Hanushek and
Jackson (1977) or Wiley and Harnischfeger (1973) for multiequation
estimation techniques for the general path model.

2 The superscript t will be used to designate vector and matrix
transposition.

3 It is also easy to show that b'_0 is a biased and inconsistent
estimator of b_0. After the b_j have been determined, b_0 may be found
using the formula (Cochran, 1968, p. 651):

[formula illegible in source]

4 It should be obvious that these statements hold quite generally
and not only with respect to studies of change. Indeed, the problems of
measurement error considered in this report afflict all statistical
models, not just those for analyzing change. The issue with regard to
test-retest data assumes greater theoretical and methodological import
because of the conceptual status of the partialled variate, i.e., change.






Table 1

Values of b'_1 and b'_2 when b_1 = 2.0 and b_2 = 1.0 (a)

                        r_X1X2 = +0.3                r_X1X2 = -0.3
                   r_11 = .6    .8     1.0      r_11 = .6    .8           1.0
r_22   Estimate
 .6    b'_1         1.25      1.68    2.13       1.10      [illegible]   1.87
       b'_2          .74       .66     .58        .44       .51           .58

 .8    b'_1         1.20      1.6[illegible]  2.06   1.13   1.52          1.94
       b'_2          .99       .89     .78        .59       .69           .78

1.0    b'_1         1.15      1.57    2.00       1.15      1.57          2.00
       b'_2         1.25      1.13    1.00        .75       .87          1.00

(a) Adapted from Cochran (1968, p. 657), Table 11.1.






Table 2

Effects of the Fallibility of a Partialled Variable (a)

Example   r_yx2         r_yx1         r_x1x2   r_11   r_yx2.x1         r_YX2.X1
   1       .3            .5            .6       .7     .00              -.23
   2       .5            .7            .5       .7     [illegible]      [illegible]
   3       .5            .7            .6       .7     .1[illegible]    [illegible]
   4       .5            .3            .8       .7     .45              .5[illegible]
   5       [illegible]   [illegible]   .6       .7     .42              [illegible]

(a) Reproduced from Cohen and Cohen, 1975, p. 371, Table 9.5.1.
Note. For all examples, r_yy = r_22 = 1.0.






Figure 1. General Structural Model for Studying Change. Adapted from
Wiley and Harnischfeger (1973, p. 48). [Path diagram not reproduced.]

Figure 2. Structural Model of Change including Errors of Measurement.
[Path diagram not reproduced.]

Figure 3. Typical Regression Model for Studying Change.
[Path diagram not reproduced.]

CHAPTER III

REVIEW OF METHODS THAT CORRECT FOR ERRORS OF MEASUREMENT

INTRODUCTION

In the previous chapter we established the importance of structural
models for explaining change, proved the bias of OLS regression of
observed scores for estimating the parameters of the true score
distributions, demonstrated the potential deleterious effects of
measurement error on inferences concerning the determinants of change,
and considered the requirements for identification and linearity. Now a
variety of single-equation statistical methods that have been devised to
estimate the structural equations can be reviewed. Explication of
multiple-equation models and multiple-indicator structural equation
models (e.g., Duncan & Goldberger, 1973; Aigner & Goldberger, 1976;
Sörbom, 1978) lies beyond the scope of this report. (However, see Chapter
IV.) In this chapter our attention will focus exclusively on techniques
which utilize a priori information about the errors of measurement in the
estimation process. The objective is to draw together techniques from
diverse sources, to express them in a common algebra that is synchronous
with the equations of the preceding chapter, and to analytically evaluate
them in terms of statistical criteria, such as bias, power, and
robustness. The derivations and analytic results should prove of value
to educational researchers who wish to estimate the structural parameters
of change. In the first section we will consider the original
attenuation corrections of Spearman and then in succeeding sections four
multiple regression methods that are suitable for the study of true
change.



SPEARMAN'S CORRECTION FOR ATTENUATION
Although statisticians have been aware of the bias in OLS regression
caused by errors of measurement for a long time (see Adcock, 1878;
Kummell, 1879; Pearson, 1901), Spearman (1904) was probably the first to
derive an expression for the bias and propose a method for correcting the
OLS estimators for attenuation due to errors in variables. Without
presenting a proof, Spearman (1904, p. 90) suggested that a zero-order
correlation could be corrected for the attenuation caused by errors of
measurement by dividing the observed correlation by the product of the
square roots of the reliabilities:

$$ \hat{r}_{YX} = \frac{r_{yx}}{\sqrt{r_{yy}}\,\sqrt{r_{xx}}} $$

In psychological research the investigator is usually interested in
the association between two constructs or latent variables and not the
attenuated correlation between observables. Thus, it is frequently
recommended that correlations be corrected for attenuation (e.g., Block,
1963). One difficulty with Spearman's correction, however, resides in the
fact that a corrected correlation may exceed 1.0, which has given the
technique a skeptical audience (A.P.A., 1966). Such an outcome may
result from sampling error in the correlation or in the reliabilities.
However, Bock & Petersen (1975) have developed a restricted maximum
likelihood estimator of the attenuation-corrected correlation which
cannot be greater than 1.0. An additional problem has been the lack of
an exact formula for the standard error of a corrected correlation and
procedures for hypothesis testing. Approximate formulas have been
offered by Shen (1924), Cureton (1936), Kelley (1947), and Forsyth and
Feldt (1969).
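
Spearman's correction is a one-line computation, sketched below with a hypothetical function name. The input values in the usage lines are invented solely to show the two behaviours discussed above: an ordinary disattenuated value and a corrected value that exceeds 1.0.

```python
import math


def correct_for_attenuation(r_yx, r_yy, r_xx):
    """Spearman's correction: observed correlation over the root of the reliabilities."""
    return r_yx / math.sqrt(r_yy * r_xx)


# Ordinary case: the corrected value is larger than the observed .40.
print(correct_for_attenuation(0.40, 0.80, 0.50))
# Sampling error in any of the three inputs can push the result above 1.0.
print(correct_for_attenuation(0.70, 0.60, 0.70))
```

The sketch returns the raw corrected value without truncation, so the second call illustrates the out-of-range outcome that motivated the restricted estimator of Bock and Petersen.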






Utilizing the finding that the sampling distribution of a corrected
correlation approximates a normal curve, Forsyth and Feldt (1969) adapted
Kelley's (1947) formula to provide estimates of the standard error and to
test hypotheses based on normal distribution theory. According to the
results of Monte Carlo studies, their method gives reasonably good
control of Type I error as indicated by the correspondence of empirical
and nominal alpha levels. In addition, the procedure worked very
adequately for establishing 90 and 95 per cent confidence intervals for
r_YX. Thus, it would appear that for questions concerning the relative
stability of individual differences in an attribute, Forsyth and Feldt's
(1969) method can be recommended.

If one wishes, however, to test the hypothesis of no change or
perfect stability in individual differences in a trait that is unreliably
measured, then a different hypothesis testing strategy should be
pursued. In a comparison of their normal curve procedure with a
modification of McNemar's (1958) test of the hypothesis that the
population correlation corrected for attenuation equals 1.0, Forsyth and
Feldt (1970) found that the Forsyth-Feldt modification of McNemar's test
produced empirical alpha values closer to nominal values in a series of
simulated experiments (see Chapter IV). In the meanwhile Jöreskog (1971,
1974) has devised a maximum likelihood test that not only evaluates
H_0: r_YX = 1.0 but evaluates the assumptions upon which the McNemar
(1958) and Lord (1957) tests are based. However, Jöreskog's covariance
structure analysis requires multiple measures of time 1 and time 2
status. The conditions under which either the Forsyth-Feldt-McNemar
(1970) test or Jöreskog's procedure is relatively superior have not been
determined. Clearly, both of these procedures have useful roles to play
in the study of change. When only single pre- and post-measurements and
estimates of the reliabilities are available, the Forsyth-Feldt-McNemar
test should be used. These procedures are developed more extensively in
the next chapter.

When there are measurements available on treatment or background
factors and interest centers on the effects of these variables on change,
an index of effect which has been often recommended is the partial
correlation (cf. Bereiter, 1963; Cohen & Cohen, 1975; Lord, 1963). The
partial of interest is the correlation between final status and the
determinant of change with the pretest partialled out. As noted above,
this may be interpreted as the correlation between change and the
treatment or background factor. But if it is computed from fallible
observations, erroneous inferences may result. To overcome this problem,
Spearman's bivariate correction for attenuation has been generalized to
partial and semipartial correlations.

The formulas for a partial correlation corrected for attenuation were
first presented by Stouffer (1936a, b) and have since been independently
derived or discussed by Bereiter (1963), Bohrnstedt (1969), Bohrnstedt
and Carter (1971), Cohen and Cohen (1975), Dunivant (1975), Hotelling
(1933), Hummel-Rossi and Weinberg (1975), Kahneman (1965), Lord (1958,
1963, 1974), Meredith (1964), Mulaik (1971), O'Connor (1972), Saunders
(195[illegible]), and Tucker, Damarin, and Messick (1967). To derive the
formula for the estimator of the fully corrected partial correlation, one
simply substitutes into the formula for the true partial correlation

$$ r_{YX_2 \cdot X_1} = \frac{r_{YX_2} - r_{YX_1} r_{X_1X_2}}{\sqrt{(1 - r_{YX_1}^2)(1 - r_{X_1X_2}^2)}} $$

the three estimators from Spearman

$$ \hat{r}_{YX_1} = \frac{r_{yx_1}}{\sqrt{r_{yy} r_{11}}}, \qquad \hat{r}_{YX_2} = \frac{r_{yx_2}}{\sqrt{r_{yy} r_{22}}}, \qquad \hat{r}_{X_1X_2} = \frac{r_{x_1x_2}}{\sqrt{r_{11} r_{22}}}. $$

Simplifying the expression, the final result may be written

$$ \hat{r}_{YX_2 \cdot X_1} = \frac{r_{yx_2} r_{11} - r_{yx_1} r_{x_1x_2}}{\sqrt{(r_{yy} r_{11} - r_{yx_1}^2)(r_{22} r_{11} - r_{x_1x_2}^2)}} $$

The formulas for incompletely corrected partial and semipartial
correlations, i.e., corrected for unreliability in only one or two of the
variables, are given in Dunivant (1975).
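
The fully corrected partial can be computed directly from the observed correlations and the reliabilities. The sketch below implements the simplified expression above under a hypothetical function name. The usage lines take r_yx2 = .3, r_yx1 = .5, r_x1x2 = .6 and r_11 = .7 as a reading of the first example in Table 2; that column assignment is an assumption about a partly illegible table.

```python
import math


def corrected_partial(r_yx2, r_yx1, r_x1x2, r_yy=1.0, r_11=1.0, r_22=1.0):
    """Partial correlation of y with x2, holding x1 constant, corrected for attenuation.

    With all reliabilities equal to 1.0 this reduces to the ordinary
    observed-score partial correlation.
    """
    numerator = r_yx2 * r_11 - r_yx1 * r_x1x2
    # The product under the root can be zero or negative in small samples,
    # in which case the estimate is undefined and math.sqrt raises an error.
    denominator = math.sqrt((r_yy * r_11 - r_yx1 ** 2) * (r_22 * r_11 - r_x1x2 ** 2))
    return numerator / denominator


observed = corrected_partial(0.3, 0.5, 0.6)             # no correction applied
corrected = corrected_partial(0.3, 0.5, 0.6, r_11=0.7)  # pretest reliability .7
print(round(observed, 2), round(corrected, 2))
```

Under that reading the two calls give the observed and structural values quoted for Cohen and Cohen's first example (.00 and -.23), so the sign of the correlation with change appears only after the correction is applied.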

Although social researchers have been urged to use these corrected
rather than the observed-score partials, they have not been given the
means to place exact confidence limits on or to test hypotheses about the
corrected coefficients. Lord (1974, 1975) used large sample procedures
to develop the sampling theory of corrected partials. He succeeded in
deriving an asymptotically efficient estimator of the
corrected-for-attenuation partial correlation (which is essentially
identical to the one arrived at by the substitution procedure outlined in
the preceding paragraph). Furthermore, Lord (1975; Stocking & Lord,
1974) provides an asymptotic test for the estimator utilizing a numerical
differentiation computer program. Noting that the estimator could become
infinite because of sampling fluctuations in the zero-order correlations
or reliabilities, he suggests the corrected estimator "may show very
large sampling fluctuations if the sample is too small, especially if
either Y or X_2 is highly correlated with the true score X_1. The
sampling fluctuations (of the estimator) will in certain cases be so
large as to make the calculation of r_{YX2.X1} almost useless" (Lord,
1974, p. 215).

While the observed-score partial may lead to faulty inferences about
true change, it seems that the corrected-for-attenuation partial may
prove little better. However, in situations where the corrected
coefficient is finite and the standard error of the corrected partial is
not very large as computed by the AUTEST program (Stocking & Lord, 1974),
inferences should be drawn on the basis of the corrected partial. It is
clear from Lord's (1974) warnings that investigators would be very unwise
to compute partial correlations corrected for attenuation and fail to
estimate their sampling variation. This problem of determining the
sampling distributions of the estimators corrected for errors of
measurement will reappear often as the other methods are discussed.

STOUFFER-LINDLEY METHOD

The first general procedure for correcting for errors of measurement
in the multiple regression case was proposed by Stouffer (1936a, b) and
developed more formally by Lindley (1947), who proved several theorems on
the technique. Stouffer and Lindley have been followed by many others
who have discovered the correction independently or who have explained
the method, including Bohrnstedt (1969), Bohrnstedt and Carter (1971),
Cochran (1968, 1970, 197), Cohen and Cohen (1975), Cronbach and Furby
(1970), Cronbach, Rogosa, Floden and Price (1977), DuBois (1957),
DeGracie (1968), DeGracie and Fuller (1972), Fuller (1975), Harnqvist
(1958), Hummel-Rossi and Weinberg (1975), Johnston (1963, 1972), Kendall
and Stuart (1961), Koopmans (1977), Meredith (1964), Theil (1965, 1971),
Werts and Linn (1970), Wiley and Harnischfeger (1972), and Wiley and
Hornik (1973). In our presentation of the method, we will follow Johnston
(1963, p. 163ff) closely.

Consider the two-predictor models presented in Equations 3, 4, 6 and
22 of Chapter II where $X_1$ is the pretest and $X_2$ is some determinant
of change. The following assumptions are made:

$$E(u_1) = E(u_2) = E(v) = 0 \tag{5}$$

$$E(u_1^2) = s_{u_1}^2, \qquad E(u_2^2) = s_{u_2}^2, \qquad E(v^2) = s_v^2 \tag{6}$$

$$u_1 \sim NID(0, s_{u_1}^2), \quad u_2 \sim NID(0, s_{u_2}^2), \quad v \sim NID(0, s_v^2), \quad e \sim NID(0, s_e^2) \tag{7}$$

$$E(u_1 e) = E(u_2 e) = E(ve) = 0 \tag{8}$$

$$E(u_1 u_2) = s_{u_1u_2}, \qquad E(u_1 v) = s_{u_1v}, \qquad E(u_2 v) = s_{u_2v} \tag{9}$$

$$E(u_1X_1) = E(u_1X_2) = E(u_1Y) = E(u_2X_2) = E(u_2Y) = E(vX_1) = E(vX_2) = E(vY) = 0 \tag{10}$$



We recall from Chapter II that 



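$$s_{x_1}^2 = s_{X_1}^2 + s_{u_1}^2 \tag{11}$$

$$s_{x_2}^2 = s_{X_2}^2 + s_{u_2}^2 \tag{12}$$

and note the following identities from regression theory

$$s_{x_1x_2} = s_{X_1X_2} + s_{u_1u_2}, \tag{13}$$

$$s_{yx_1} = b_1 s_{X_1}^2 + b_2 s_{X_1X_2} + s_{u_1v}, \tag{14}$$

and

$$s_{yx_2} = b_1 s_{X_1X_2} + b_2 s_{X_2}^2 + s_{u_2v}. \tag{15}$$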

If we assume that $X_1$ and $X_2$ have a bivariate normal distribution,
then $Y$, $X_1$ and $X_2$ are multinormally distributed. Maximum likelihood
estimators of the population variances and covariances are given on the
left sides of Equations 11, 12, 13, 14 and 15. Now if the population
measurement error variances and covariances are known a priori, then
Equations 11 and 12 can be substituted into 13, 14 and 15. This results
in the following system of simultaneous equations for $b_1$ and $b_2$:

$$b_1(s_{x_1}^2 - s_{u_1}^2) + b_2(s_{x_1x_2} - s_{u_1u_2}) = (s_{yx_1} - s_{u_1v}) \tag{16}$$

$$b_1(s_{x_1x_2} - s_{u_1u_2}) + b_2(s_{x_2}^2 - s_{u_2}^2) = (s_{yx_2} - s_{u_2v}) \tag{17}$$

Solution of these two normal equations produces estimators for the
structural coefficients for the effects of $X_1$ and $X_2$ on change
which are identical in form to those given by Werts and Linn (1970). (To
establish the correspondence, assume that the error covariances equal
zero and use the identity $s_{x_j}^2 - s_{u_j}^2 = r_{jj} s_{x_j}^2$.)

The reader will also note that the present formulation is less
restrictive than Johnston's since it allows correlated errors of
measurement. In actual practice, however, the extent of such
correlations will be determined infrequently and typically will be
assumed to be zero. The main point is that the present formulation is
general enough to handle such a situation. Inserting sample $y$ and $x_j$
variances and covariances and known population error variances and
covariances in Equations 16 and 17 and solving yields maximum likelihood
estimates of $b_1$ and $b_2$. The intercept constant and residual variance
from regression are found in the usual ways. Thus, no great estimation
problems are encountered as long as the observations are based on sample
sizes of approximately 70 or more (cf. Stroud, 1969). This will
virtually ensure that the sampling errors of the observed variances do
not cause negative true score variance estimates.
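
The two-predictor solution can be illustrated with a short Python sketch. It solves Equations 16 and 17 by Cramer's rule; the function name, argument names, and the numerical values are hypothetical, and the error variances are treated as known constants, as the text assumes.

```python
def stouffer_lindley_two_predictor(s_x1_sq, s_x2_sq, s_x1x2, s_yx1, s_yx2,
                                   s_u1_sq, s_u2_sq,
                                   s_u1u2=0.0, s_u1v=0.0, s_u2v=0.0):
    """Solve Equations 16 and 17 for (b1, b2).

    Observed variances and covariances are corrected by subtracting the
    (assumed known) error variances and covariances.
    """
    a11 = s_x1_sq - s_u1_sq      # estimated true variance of X1
    a22 = s_x2_sq - s_u2_sq      # estimated true variance of X2
    a12 = s_x1x2 - s_u1u2        # estimated true covariance of X1, X2
    c1 = s_yx1 - s_u1v
    c2 = s_yx2 - s_u2v
    if a11 <= 0 or a22 <= 0:
        raise ValueError("negative or zero true score variance estimate")
    det = a11 * a22 - a12 ** 2
    if det <= 0:
        raise ValueError("corrected predictor matrix is not positive definite")
    b1 = (c1 * a22 - c2 * a12) / det
    b2 = (c2 * a11 - c1 * a12) / det
    return b1, b2


if __name__ == "__main__":
    # Hypothetical observed moments generated from b1 = 0.5, b2 = 1.5,
    # true variances 3 and 2, true covariance 1, error variances 1 and 0.5.
    print(stouffer_lindley_two_predictor(4.0, 2.5, 1.0, 3.0, 3.5, 1.0, 0.5))
```

The two guards correspond to the situation the text warns about: with small samples, sampling error in the observed variances can produce a nonpositive estimate of a true score variance.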

We now present the matrix formulation for the general case. Without
writing down the expressions, we assume that the assumptions in Equations
5-10 are generalized to the multivariate case. Then in matrix notation

$$\mathbf{S}_{XX}\mathbf{b} = \mathbf{s}_{YX}, \tag{18}$$

and

$$\mathbf{S}_{XX} = \mathbf{S}_{xx} - \mathbf{S}_{uu}, \tag{19}$$

$$\mathbf{s}_{YX} = \mathbf{s}_{yx} - \mathbf{s}_{vu}. \tag{20}$$

Substituting the right sides of the last two formulas for the
corresponding quantities in the first equation yields

$$(\mathbf{S}_{xx} - \mathbf{S}_{uu})\mathbf{b} = (\mathbf{s}_{yx} - \mathbf{s}_{vu}). \tag{21}$$

Then the solution to the so-called normal equations is

$$\hat{\mathbf{b}} = (\mathbf{S}_{xx} - \mathbf{S}_{uu})^{-1}(\mathbf{s}_{yx} - \mathbf{s}_{vu}). \tag{22}$$

Assuming that the matrix inverse exists, the method will yield a
unique estimate of the vector of structural coefficients. Of course, $\hat{\mathbf{b}}$
is a least squares estimator of $\mathbf{b}$ under the stated assumptions. Lindley
(1947), Kendall and Stuart (1961) and Cochran (1968, 1970) present these
results in a slightly different way. According to their formulation the
estimate of the structural coefficients is expressed as a weighted
function of the observed (biased) regression weight vector $\mathbf{b}^*$, the error
variances and the predictor covariances. Since (taking $\mathbf{s}_{vu} = \mathbf{0}$)

$$(\mathbf{S}_{xx} - \mathbf{S}_{uu})\mathbf{b} = \mathbf{s}_{xy} \tag{23}$$

and

$$\mathbf{S}_{xx}\mathbf{b}^* = \mathbf{s}_{xy}, \tag{24}$$

then

$$(\mathbf{S}_{xx} - \mathbf{S}_{uu})\mathbf{b} = \mathbf{S}_{xx}\mathbf{b}^* \tag{25}$$

$$\mathbf{b} = (\mathbf{S}_{xx} - \mathbf{S}_{uu})^{-1}\mathbf{S}_{xx}\mathbf{b}^*. \tag{26}$$
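
The equivalence of the two formulations can be checked numerically. The following Python sketch uses only the standard library, assumes uncorrelated errors between $y$ and the predictors ($\mathbf{s}_{vu} = \mathbf{0}$), and uses hypothetical helper names and data; it is meant as an illustration of the algebra, not as a reproduction of any published program.

```python
def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(map(float, A[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular; no unique solution")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z


def subtract(A, B):
    """Elementwise difference of two square matrices."""
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def matvec(A, v):
    """Matrix-vector product."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def corrected_weights(S_xx, s_yx, S_uu):
    """Structural weights from corrected moments (s_vu assumed zero)."""
    return solve(subtract(S_xx, S_uu), s_yx)


def corrected_from_ols(S_xx, S_uu, b_star):
    """Structural weights expressed through the observed OLS weights."""
    return solve(subtract(S_xx, S_uu), matvec(S_xx, b_star))


if __name__ == "__main__":
    S_xx = [[4.0, 1.0], [1.0, 2.5]]     # hypothetical observed moments
    S_uu = [[1.0, 0.0], [0.0, 0.5]]     # hypothetical known error variances
    s_yx = [3.0, 3.5]
    b_star = solve(S_xx, s_yx)          # ordinary least squares weights
    print(corrected_weights(S_xx, s_yx, S_uu))
    print(corrected_from_ols(S_xx, S_uu, b_star))
```

Both calls should print the same vector up to rounding, which is the content of the identity relating the corrected and observed weight vectors.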

It should be apparent that the procedures outlined by Cronbach and
Furby (1970) for correcting the observed variance-covariance matrix lead
to the same solution as Johnston's since $s_{x_j}^2 - s_{u_j}^2 = r_{jj} s_{x_j}^2$. If the
covariance matrix is standardized, i.e., transformed to a correlation
matrix, then $r_{jj} s_{x_j}^2 = r_{jj}$ (because each $s_{x_j}^2 = 1$) become the
diagonal elements, which is Stouffer's (1936a, b) method. If Stouffer's
corrected matrix (a correlation matrix with reliabilities on the diagonal)
is standardized (i.e., transformed into a correlation matrix with unities
on the principal diagonal), then we have a correlation matrix corrected for
attenuation by the Spearman formula given as Equation 1. Meredith
(1964) described the application of multivariate statistical techniques,
e.g., canonical correlation, to the attenuation-corrected correlation
matrix. This method produces standardized partial regression
coefficients which can be rescaled in the metric of the true scores as
Cohen and Cohen (1975) describe. It should be apparent that if we
calculate partial correlations from any of the corrected matrices
described in this section, the coefficients will equal those calculated
by the Spearman-based formulas given in the preceding section. This
identity suggests that there may be problems in specifying the sampling
theory for the estimators of the structural parameters computed from the
observed covariance matrix corrected for errors of measurement.

There are at least three potential problems when one wishes to
establish confidence limits and test hypotheses about the structural
coefficients. First, sampling error in the observed variances and
covariances may produce a corrected matrix which is non-Gramian and
therefore not admissible as a covariance matrix (Bock & Petersen, 1975;
Fuller & Hidiroglou, 1977; Williams, 1959). Second, the population error
variances or reliabilities will be estimated from prior studies rather
than being known. Sampling errors in the error variances can have the
same damaging effect as sampling errors in the observed variances and
covariances. Furthermore, this contributes another source of variability
to the estimated regression coefficients which is not represented in the
formulas for the standard errors of the $\hat{b}_j$. Thus, confidence intervals
and hypothesis tests will be approximate at best (Cohen & Cohen, 1975;
Fuller, 1977; Warren, White, & Fuller, 1974). The third problem which
may be encountered is the exacerbation of the first two problems by
multicollinearity in the data, specification error in the model, etc.
Although some have recommended the use of the standard formulas for
calculating F- and t-ratios and standard errors (e.g., Cohen & Cohen,
1975), these potential problems should give the researcher a skeptical
attitude when interpreting such statistics. It may be noted that if an
expression for the standard error could be written which included
information about the sampling error of the error variance or
reliability, Stocking and Lord's AUTEST program could be used to provide
an asymptotic test.
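
The first of these problems can be screened for before any coefficients are computed. The Python sketch below is one possible check, not a procedure taken from the cited authors: it subtracts assumed error variances from the diagonal of an observed covariance matrix (uncorrelated errors are assumed) and then attempts a Cholesky factorization, which succeeds only if the corrected matrix is positive definite. Function names and data are hypothetical.

```python
import math


def subtract_error_variances(S_obs, error_vars):
    """Corrected matrix: observed covariances with the error variances
    removed from the diagonal (errors assumed mutually uncorrelated)."""
    n = len(S_obs)
    return [[S_obs[i][j] - (error_vars[i] if i == j else 0.0)
             for j in range(n)] for i in range(n)]


def is_positive_definite(S, tol=1e-12):
    """Return True if the symmetric matrix S has a Cholesky factor.

    Strict positive definiteness is checked because the corrected
    matrix must also be invertible for the estimator to exist.
    """
    n = len(S)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                d = S[i][i] - s
                if d <= tol:
                    return False
                L[i][j] = math.sqrt(d)
            else:
                L[i][j] = (S[i][j] - s) / L[j][j]
    return True


if __name__ == "__main__":
    S_obs = [[4.0, 1.0], [1.0, 2.5]]                 # hypothetical data
    ok = subtract_error_variances(S_obs, [1.0, 0.5])
    bad = subtract_error_variances(S_obs, [3.8, 0.5])
    print(is_positive_definite(ok), is_positive_definite(bad))
```

In the second hypothetical case every corrected variance is still positive, yet the matrix as a whole is inadmissible; inspecting the diagonal alone would miss this.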

Before concluding this section we consider the issue of analyzing the
effects of treatment or background groups on change via the
Stouffer-Lindley method. As in a regular multiple regression analysis,
information about group effects may be represented in a variety of ways
by means of dummy-coded vectors. Analysis of covariance can be handled
in this way as described earlier, but what are the consequences of
including dummy-coded variables in the corrected covariance matrix? Four
questions arise when we try to evaluate the effects of treatment or
background group characteristics on change by means of dummy-coded
variables. First, what is the consequence of violating the assumption of
multivariate normality which will result from the inclusion of dummy
variables? Second, can the procedure of testing for heterogeneity of
regression slopes by means of product vectors be extended to this
attenuation-correction method? Third, what are the effects of
heterogeneous error variances or reliabilities between groups on the
corrected estimators? And, finally, how can error of measurement
(classification) in the group factors be incorporated into the analysis?
Particularly in field or quasi-experimental studies, errors of this kind
may be present, e.g., in racial or ethnic group membership, SES, or
classification as culturally disadvantaged. Another way of stating this
problem is that the selection rule for assigning individuals to groups
may not be exactly known. Recent progress on this problem has been made
by Aigner (1973), Cronbach et al. (1977), Games (1975), Mouchart (1972,
1977), and Murray (1971). To conclude this section we remark that even
though the Stouffer-Lindley method has been around for quite some time, there is
still a great deal to learn about its performance in practical 

) ■ . ' . 

STROUD'S METHOD

Stroud (1968, 1972, 1974) has developed an asymptotic chi-square test
of the hypothesis of equality of conditional means and variances of true
scores for two groups, which is based on unrestricted maximum-likelihood
estimators described by Wald (1943). The estimator of the covariance
matrix for the latent variables is the same as that for the
Stouffer-Lindley method. However, the covariance matrices for the two
groups are each scaled in the metric of the error variance, which is
assumed to be the same for both groups. This implies that when the group
variances on $x_1$ differ, the reliability of $x_1$ for groups 1 and 2
cannot be equal. This is in accord with the traditional psychometric
assumption that it is the measurement error variance rather than the
reliability (or, equivalently, the true score variance) which is
invariant across samples of a population.
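
The implication about unequal reliabilities is simple arithmetic, sketched here in Python with hypothetical numbers. Reliability is taken in its usual sense as the ratio of true score variance to observed variance, so a common error variance combined with unequal observed variances necessarily gives unequal reliabilities.

```python
def reliability(s_x_sq, s_u_sq):
    """Reliability as true variance over observed variance:
    (s_x^2 - s_u^2) / s_x^2 = 1 - s_u^2 / s_x^2."""
    if s_x_sq <= 0 or s_u_sq < 0 or s_u_sq > s_x_sq:
        raise ValueError("variances are not admissible")
    return 1.0 - s_u_sq / s_x_sq


if __name__ == "__main__":
    common_error_variance = 4.0          # hypothetical, same in both groups
    for label, observed_variance in (("group 1", 10.0), ("group 2", 20.0)):
        r = reliability(observed_variance, common_error_variance)
        # Roughly 0.6 for group 1 and 0.8 for group 2.
        print(label, round(r, 3))
```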

The rescaling of the covariance matrix leaves all of the 



scale-invariant statistics unchanged, e.g., r or /r-r „ 2 ; 

thus, the standardized results of regression analyses by the
Stouffer-Lindley and Stroud methods are identical, e.g., correlations,
standardized partial regression weights, and significance tests.
However, the scale-dependent statistics are affected by the
rescaling. Therefore, the $\hat{b}_j$ or $s_e^2$ will not generally agree
between the two methods. This reviewer recommends that Stroud's
estimates of the raw partial regression coefficients and the residual
variance be rescaled to the metric of the true scores (to conform with
the Stouffer-Lindley estimates) for purposes of interpretation and
description.

The contribution of Stroud's method is that it provides a
significance test for the homogeneity of regression for two samples.
Thus, it could be used to draw inferences about the differences in change
between two groups. Unfortunately, the generality of the method is
limited by the fact that (a) sampling error in the estimates of the
variances of the measurement errors is not taken into account, (b) the
method has been formulated for only the two-sample problem (although its
author comments that the generalization is straightforward), and (c)
separate tests of the intercept and slope parameters are not given so
that tests analogous to ANCOVA's tests of differences in adjusted
posttest means (or gain) are not possible. A strength of the method is
that a multivariate generalization, i.e., multiple dependent variables,
which uses Lord's AUTEST program, has been developed (Stroud, 1968, 1974).

FULLER'S METHODS 

During the past decade Fuller and his associates have devised several
procedures for correcting for errors of measurement. We will present in
detail his most general formulation (Fuller, 1980) and then the related
techniques more briefly.

The model and assumptions posited by Fuller (1980) for his case (i)
are virtually identical to those stated for the Stouffer-Lindley method.
Particularly, errors in the model ($e$) as well as errors in variables
($v$, $u_j$) are permitted and assumed to follow a multivariate normal
distribution. Furthermore, covariances between errors may be nonzero.
Fuller (1980, p. 7ff) defines the estimator of the structural parameters
as



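$$\hat{\mathbf{b}} = (\mathbf{H} + a n^{-1}\mathbf{S}_{uu})^{-1}(\mathbf{N} + a n^{-1}\mathbf{s}_{vu}), \tag{27}$$

where $a > 0$ is a fixed real number,

$$\mathbf{H} = \begin{cases} \mathbf{S}_{xx} - \mathbf{S}_{uu} & \text{if } g \ge 1 + n^{-1} \\ \mathbf{S}_{xx} - (g - n^{-1})\mathbf{S}_{uu} & \text{if } g < 1 + n^{-1}, \end{cases} \tag{28}$$

$$\mathbf{N} = \begin{cases} \mathbf{s}_{yx} - \mathbf{s}_{vu} & \text{if } g \ge 1 + n^{-1} \\ \mathbf{s}_{yx} - (g - n^{-1})\mathbf{s}_{vu} & \text{if } g < 1 + n^{-1}, \end{cases} \tag{29}$$

$$\mathbf{S} = \begin{pmatrix} s_y^2 & \mathbf{s}_{yx} \\ \mathbf{s}_{xy} & \mathbf{S}_{xx} \end{pmatrix}, \tag{30}$$

$g$ is the smallest root of $|\mathbf{S} - g\boldsymbol{\Sigma}| = 0$, and

$$\boldsymbol{\Sigma} = \begin{pmatrix} s_v^2 & \mathbf{s}_{vu} \\ \mathbf{s}_{uv} & \mathbf{S}_{uu} \end{pmatrix}. \tag{31}$$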



This estimator of $\mathbf{b}$ differs from the Stouffer-Lindley estimator in
two important ways. First, the modification associated with $g$
"guarantees that $\mathbf{H}$ is a positive definite matrix, that the estimator of
$s_e^2$ is positive and the estimator of $\mathbf{b}$ possesses finite variance . . .
The a-modification gives an estimator that is similar to the 'k-class'
estimators used in simultaneous equation estimation" (Fuller, 1980, pp.
7-8). This latter modification produces smaller MSE of the coefficients
in finite samples.
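
To make the role of the two modifications concrete, the following Python sketch treats the single-predictor case only. It rests on explicit assumptions that should be checked against Fuller (1980) before any serious use: the estimator is assumed to take the form $\hat{b} = (N + a n^{-1} s_{vu}) / (H + a n^{-1} s_u^2)$, with $H = s_x^2 - k s_u^2$ and $N = s_{yx} - k s_{vu}$, where $k = 1$ if $g \ge 1 + n^{-1}$ and $k = g - n^{-1}$ otherwise, and $g$ is the smallest root of $|\mathbf{S} - g\boldsymbol{\Sigma}| = 0$. With one predictor that determinant is a quadratic in $g$, so no matrix library is needed. All names and numbers are hypothetical.

```python
import math


def fuller_single_predictor(s_y2, s_yx, s_x2, s_u2, n,
                            s_v2=0.0, s_vu=0.0, a=1.0):
    """Single-predictor sketch of an estimator with g- and a-modifications.

    Returns (b_hat, g, H). The form of the estimator is an assumption of
    this sketch, stated in the accompanying text.
    """
    # |S - g*Sigma| = A*g^2 + B*g + C for the 2 x 2 case.
    A = s_v2 * s_u2 - s_vu ** 2
    B = -(s_y2 * s_u2 + s_x2 * s_v2 - 2.0 * s_yx * s_vu)
    C = s_y2 * s_x2 - s_yx ** 2
    if abs(A) > 1e-12:
        disc = max(B * B - 4.0 * A * C, 0.0)
        g = (-B - math.sqrt(disc)) / (2.0 * A)   # smaller root, since A > 0
    elif abs(B) > 1e-12:
        g = -C / B                               # y measured without error
    else:
        raise ValueError("no measurement error specified; g is undefined")
    # g-modification: shrink the correction when g falls below 1 + 1/n.
    k = 1.0 if g >= 1.0 + 1.0 / n else g - 1.0 / n
    H = s_x2 - k * s_u2
    N = s_yx - k * s_vu
    # a-modification: add a small multiple of the error moments.
    b_hat = (N + (a / n) * s_vu) / (H + (a / n) * s_u2)
    return b_hat, g, H


if __name__ == "__main__":
    # Hypothetical sample in which the naive correction s_x2 - s_u2 < 0.
    b_hat, g, H = fuller_single_predictor(2.0, 0.3, 0.9, 1.0, n=50)
    print(round(b_hat, 3), round(g, 3), round(H, 3))
```

In the hypothetical sample the uncorrected difference between observed and error variance is negative, so the simple correction would give a negative true score variance; under the assumed form, the modified denominator stays positive.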

These modifications represent two important advances over the
Stouffer-Lindley technique. Furthermore, Fuller (1980) provides a proof
of a theorem on the limiting distribution of $\hat{\mathbf{b}}$ which demonstrates that
the estimator is asymptotically normal and unbiased and specifies the
covariance matrix of the estimator in the limit so that large sample
confidence limits and hypothesis tests are possible. This limiting
sampling distribution assumes that the population measurement error
variances are known. For a more restricted model, however, which assumes
uncorrelated errors and incorporates the a-modification, Fuller and
Hidiroglou (1978) provide an estimator of the limiting distribution of $\hat{\mathbf{b}}$
which includes information about the variability of the reliability
estimates if available. Under the more restrictive conditions, Fuller
and Hidiroglou (1978, Theorem 1) prove that

$$n^{1/2}(\hat{\mathbf{b}} - \mathbf{b}) \xrightarrow{L} N(\mathbf{0}, \mathbf{S}_{XX}^{-1}\mathbf{G}\mathbf{S}_{XX}^{-1}), \tag{32}$$

where the elements of $\mathbf{G}$ are functions of the true $X_iX_j$ covariances
and the ratios of error variances to total variance, e.g.,
$l_{jj} = s_{u_j}^2 / s_{x_j}^2$. If the $l_{jj}$ are arranged in a diagonal matrix as

$$\mathbf{L}_{uu} = \mathrm{diag}(l_{11}, l_{22}, \ldots, l_{JJ}), \tag{33}$$

and if there is information about the sampling error of the $l_{jj}$
available in the form

$$n^{1/2}(\hat{\mathbf{l}} - \mathbf{l}) \xrightarrow{L} N(\mathbf{0}, \mathbf{P}), \tag{34}$$

then

$$n^{1/2}(\hat{\mathbf{b}} - \mathbf{b}) \xrightarrow{L} N(\mathbf{0}, \mathbf{S}_{XX}^{-1}\mathbf{G}\mathbf{S}_{XX}^{-1} + \mathbf{T}\mathbf{P}\mathbf{T}), \tag{35}$$

where

$$\mathbf{T} = \mathrm{diag}(b_1 s_{x_1x_1}, b_2 s_{x_2x_2}, \ldots, b_J s_{x_Jx_J}). \tag{36}$$

For the first (and only) time in this review we find an expression
(Equation 35) for the sampling distribution of $\hat{\mathbf{b}}$ which reflects the
additional source of variation in $\hat{b}_j$ introduced by the imprecision in
estimating $r_{jj}$ or $s_{u_j}^2$. (See also corollary 3.2 in Fuller, 1975,
p. 1?7.) Of all the techniques that have been considered, Fuller's
methods appear to produce estimators with the most desirable properties.
It is hoped that researchers involved in studying change will begin to
use these estimators so that the usefulness of Fuller's methods in
practical-size samples can be evaluated. Fortunately, computer programs
are available for performing these analyses (Hidiroglou, Fuller, &
Hickman, 1977; Wolter & Colby, 1976). A program for performing Fuller's
disattenuated regression, written in the OMNITAB programming
language as part of this research (Dunivant, 1978a), appears in the
Appendix.

Two additional procedures which have been developed by Fuller and his
associates deserve mention before leaving this section. Models of
curvilinear regressions with fallible measurements have been proposed by
Wolter (1974), and Wolter and Fuller (1977a, b). These methods may be
useful in describing the growth curve for an attribute as a function of
level of initial status. An analysis of covariance model has been
developed by DeGracie (1968) and DeGracie and Fuller (1972). Their
procedure contains a modification similar to the g-modification described
above which guarantees the existence of the variance of the
pooled-within-groups slope estimator. The estimator of the slope
parameter in ordinary ANCOVA is defined for the observed variables as

$$b_1^* = \frac{s_{yx_1}}{s_{x_1}^2}, \tag{37}$$

where these are pooled-within-groups coefficients based on the fallible
scores. The estimator of the structural parameter which represents the
slope of the posttest-on-pretest regression following the
Stouffer-Lindley method would be

$$b_{1SL} = \frac{s_{yx_1}}{s_{x_1}^2 - s_{u_1}^2}. \tag{38}$$

DeGracie and Fuller (1972) investigated the properties of two
estimators of the slope which we present here under the assumption of
$E(uv) = 0$:

$$b_{DF1} = \frac{s_{yx_1}}{\hat{s}_{X_1}^2}, \tag{39}$$

where

$$\hat{s}_{X_1}^2 = \begin{cases} s_{x_1}^2 - s_{u_1}^2 & \text{if } s_{x_1}^2 - s_{u_1}^2 > (1/m)s_{u_1}^2 \\ (1/m)s_{u_1}^2 & \text{otherwise} \end{cases}$$


and $m$ equals $N$ minus the number of groups minus one. The second
estimator was suggested by an examination of the bias in $b_{DF1}$ and is
given by




/^2sui2 + _J_ + y 



where p = a times the number of replicates per group and m (as above) is
a fixed positive number. DeGracie and Fuller (1972) prove that $b_{DF2}$
has a smaller bias and smaller mean square error than $b_{DF1}$. They
present an extension of the classical F-ratio to test the null hypothesis of
no adjusted group differences in final status or gain. Although this
inferential device does not take into account the sampling error of
$s_{u_1}^2$, it is probably a much more appropriate test statistic than
the uncorrected F for investigations of change.
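
The three slope estimators that can be recovered from this section (the ordinary ANCOVA slope, the Stouffer-Lindley slope, and the first DeGracie-Fuller slope) may be compared in a short Python sketch. The truncation rule coded for the third estimator is an assumption of the sketch: the estimated true score variance is taken to be the observed variance minus the error variance whenever that difference exceeds $(1/m)$ times the error variance, and $(1/m)$ times the error variance otherwise. Names and numbers are hypothetical.

```python
def slope_ols(s_yx, s_x_sq):
    """Observed-score pooled-within-groups slope."""
    return s_yx / s_x_sq


def slope_stouffer_lindley(s_yx, s_x_sq, s_u_sq):
    """Slope after subtracting the error variance from the covariate
    variance; undefined when the difference is not positive."""
    true_var = s_x_sq - s_u_sq
    if true_var <= 0:
        raise ValueError("nonpositive true score variance estimate")
    return s_yx / true_var


def slope_df1(s_yx, s_x_sq, s_u_sq, m):
    """Truncated slope: the denominator is never allowed to fall below
    s_u_sq / m (assumed form of the truncation rule)."""
    floor = s_u_sq / m
    true_var = s_x_sq - s_u_sq
    denominator = true_var if true_var > floor else floor
    return s_yx / denominator


if __name__ == "__main__":
    # Hypothetical pooled-within-groups moments, m = 47.
    print(slope_ols(3.0, 4.0),
          slope_stouffer_lindley(3.0, 4.0, 1.0),
          slope_df1(3.0, 4.0, 1.0, 47))
    # When the observed variance falls below the error variance the
    # truncated estimator is still defined.
    print(slope_df1(3.0, 1.0, 1.2, 10))
```

The floor on the denominator is what keeps the estimator finite in every sample, which is the property the text attributes to the DeGracie-Fuller modification.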



PORTER'S METHOD

The final method to be reviewed was formulated by Porter for his
doctoral research in 1967 and has been cited frequently since then. The
technique has been more fully elaborated in Porter and Chibucos (1974)
and Olejnick and Porter (1981). A very similar method was proposed
independently by Hunter and Cohen (1974).² Porter's method, called
estimated true score analysis of covariance, has much intuitive appeal:
one uses the traditional psychometric formula to estimate each
individual's true pretest score and then substitutes it for the observed
score as the covariate in ANCOVA.

The estimating equation for true scores for person $k$ is given in most
psychometric texts as

$$\hat{X}_k = r_{xx}(x_k - \bar{x}) + \bar{x}, \tag{41}$$

which is simply a linear transformation of $x$. Thus, for statistical
procedures that are invariant to linear transformations, e.g., multiple
regression, correlation, ANOVA and ANCOVA, use of either $x$ or $\hat{X}$ produces
identical results. For example, b^^ ■ by^ and r^^ » ^YX* 

In ANCOVA, however, all individuals may not be sampled from the same
population, so that there is not a common mean $\bar{x}$ toward which to
regress the observed scores. For the case of pre-existing or nonequivalent
groups, unequal pretest means are usually observed and offer the
possibility of regressing an individual's pretest score toward his or her
group ($h$) mean:

$$\hat{X}_{hk} = r_{xx}(x_{hk} - \bar{x}_{h.}) + \bar{x}_{h.}. \tag{42}$$




Whenever pretest mean differences occur, Equations 41 and 42 will
yield different results, and Equation 42 no longer specifies a linear
transformation of $x$ across all samples or groups represented in the
study, i.e., $r_{x\hat{X}} < 1.0$. Werts and Linn (1971) discuss the choice of
Equations 41 and 42 as depending upon whether the group means are
considered fallible. If so, Equation 41 will regress them toward the
grand mean. But since doing this produces results identical to those
obtained with the observed covariable, use of Equation 41 will not allow
us to estimate and test the structural parameters of interest. Thus,
despite claims to the contrary, Porter's method will not produce the
desired results if there are no covariate mean differences.
Additionally, if the covariate means are deemed unreliable, i.e.,
$E(u) \neq 0$ within groups, the method of estimated true score ANCOVA fails,
since it is equivalent to the observed score analysis. (It should be
noted that the method in these cases will yield the appropriate estimate
of $b_1$; however, $\hat{b}_2$, or the estimate of adjusted mean differences, will
be biased and equal to the OLS estimator $b_2^*$.)
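
The difference between regressing toward the grand mean and regressing toward the group mean can be seen in a small Python sketch. The function names and the six pretest scores are hypothetical, and a single reliability coefficient is assumed to apply in both groups.

```python
from statistics import mean


def estimated_true_grand(x, r_xx):
    """Regress every pretest score toward the grand mean."""
    grand = mean(x)
    return [r_xx * (xi - grand) + grand for xi in x]


def estimated_true_group(x, groups, r_xx):
    """Regress every pretest score toward the mean of its own group."""
    group_means = {g: mean(xi for xi, gi in zip(x, groups) if gi == g)
                   for g in set(groups)}
    return [r_xx * (xi - group_means[g]) + group_means[g]
            for xi, g in zip(x, groups)]


if __name__ == "__main__":
    x = [8.0, 10.0, 12.0, 18.0, 20.0, 22.0]   # hypothetical pretest scores
    groups = [1, 1, 1, 2, 2, 2]
    r_xx = 0.8                                # hypothetical reliability
    print(estimated_true_grand(x, r_xx))
    print(estimated_true_group(x, groups, r_xx))
```

With the grand-mean version every difference between two scores is multiplied by the same constant, so the result is a single linear transformation of $x$. With the group-mean version, within-group differences shrink while the difference between the group means is left unchanged, so no single linear transformation relates the estimated true scores to the observed scores.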

What are the properties of the estimators obtained by Porter's method
when there are pre-existing differences between the groups? First of
all, the pooled-within-groups estimate of the structural slope parameter
will be properly estimated as $\hat{b}_1$. Second, since the estimated true
pretest group means equal the observed group means, the estimate of group
effects or differences between adjusted posttest means will correspond to
the structural estimator. For the two-group case Porter's method yields
the following estimator of the treatment effects structural parameter:

$$\hat{t} = \bar{y}_{2.} - \bar{y}_{1.} - \hat{b}_1(\bar{\hat{X}}_{2.} - \bar{\hat{X}}_{1.}) = \bar{y}_{2.} - \bar{y}_{1.} - \hat{b}_1(\bar{x}_{2.} - \bar{x}_{1.}). \tag{43}$$

The final estimator of interest in ANCOVA studies of change is the
mean square error or residual variance, $s_{Y \cdot X_j}^2 = s_e^2$. It is
easily shown (e.g., Porter, 1967, p. 36) that Porter's method yields a
biased estimator of the residual variance; in fact, Porter's estimator is
identical to the OLS estimator $s_e^{*2}$. Since $s_e^{*2} > s_e^2$ generally, the fit of
the structural model will not appear as good as it really is. The hypothesis
test of the between-groups factor should be conservatively biased by use
of the inflated estimate of MSE. However, this bias might be offset by
the increase in MSE due to estimation of $r_{xx}$. The trade-off between
the upward bias in the residual variance and the failure to include the
sampling error of the reliability coefficient as a source of variation in
the estimators of the structural parameters could be sufficient to make
the method work reasonably well in actual research.

The results of Porter's (1967) Monte Carlo studies indicate that the
empirical alpha values for the estimated true-score ANCOVA F-test of
adjusted posttest means only slightly exceed the nominal alphas, e.g., an
empirical alpha of .075 compared with a theoretical value of .05. This
does not seem unreasonable in view of the fact that Porter's estimated
true covariate scores were based on reliabilities calculated as
test-retest correlations based on sample sizes of 20 to 40. Porter's
Table 20 (1967, p. 100) indicates that the sampling distribution of this
reliability estimate can be badly skewed for moderate values of $r_{xx}$ and
have large standard errors. It would be reasonable to infer that the
variance of an ANCOVA estimator derived within the framework of the
Stouffer-Lindley method, e.g., Cohen & Cohen (1975) or Cronbach & Furby
(1970), would be underestimated, sometimes substantially so, because of
the failure to incorporate the sampling error of the reliability of the
pretest. This fact would contribute to a liberal bias in the usual
F-ratio based on the structural estimators, i.e., the null hypothesis of
no group effects on change will be rejected too frequently.


Porter's method cannot be extended directly to the multiple regression
case, because there is only a single sample. In that case estimated true
and observed scores would be perfectly correlated. However, something of
the logic of making the reliability corrections within groups has been
captured by Hunter and Cohen (1974) in their estimated true-score
multiple regression analysis. Following Lord (1956), they obtain
multiple regression estimates of $X_1$ based on $x_1$ and $x_2$ and of $X_2$
based on $x_1$ and $x_2$. (See Hunter & Cohen [1974, Appendix II] and Lord
[1956] for the expression for weights associated with $x_1$ and $x_2$.)
Although Hunter and Cohen (1974) do not develop the sampling theory for
their estimators, they provide a general model that will handle
curvilinear relations.



APPLICATIONS OF METHODS TO REAL DATA

In this section we review analyses of actual data made by means of
one of the correction-for-attenuation methods just presented. Are
inferences about factors which affect change different for the
observed-score and structural models?

Several test-retest data sets have been analyzed by both the
Stouffer-Lindley method and OLS regression of the observed scores.
Dunivant (1977a) examined the relationships of type of nursery school
program (treatment), age and sex (background characteristics) to change
in sex-role identity over a nine-month interval. A composite test based
on five measures of sex-role identity with an estimated reliability of
.75 was administered to 400 children in September and again in May.
These data were submitted to a multiple regression analysis of the
observed scores and to the correction-for-attenuation regression
procedure described by Cohen and Cohen (1975). The results of the two
analyses differed in many important respects. There was even one
instance of sign reversal for one of the treatment-background factor
combinations. That is, the observed score analysis indicated that one
combination of factors significantly facilitated change while the
structural analysis indicated that the combination inhibited growth.
This example clearly demonstrates the errors of inference about change
that can result from errors of measurement.

In a major reanalysis of Project Follow Through data using Cohen and
Cohen's (1975) method, St. Pierre and Ladner (1977) found no effects
which differed in sign between the corrected and uncorrected ANCOVAs.
However, the percentage of changes in inferences about treatment effects
on gain from the observed score to the corrected ANCOVA was as great as
21% when a pretest reliability of .6 was assumed, e.g., from null to
positive, negative to null, etc. The means of the standard errors for
their significance tests were all smaller than those from the uncorrected
ANCOVA. They concluded that "correction of the pretest for assumed
unreliability can lead to changes in the conclusions that an evaluator
reaches in terms of the rank order of sponsors as well as the overall
level of program effectiveness (across sponsors)" (St. Pierre and Ladner,
1977, p. 21).

Stroud (1972) used his method of comparing regressions based on
fallible data to determine if the pattern of change in school achievement
differed for males and females between the tenth and eleventh grades.
The results of his asymptotic test for equality of regressions and
conditional variances did not differ from those of the uncorrected
regressions.

Several applications of Fuller's methods have been presented. Only
one is directly relevant to our present concerns (see below). Rindskopf
(1976) used DeGracie and Fuller's (1972) disattenuated ANCOVA method to
reanalyze data from the national Head Start evaluation (Cicirelli, 1969)
and Glass's (1970) evaluation of ESEA Title 1 programs. When low
estimates of pretest reliabilities were used, the DeGracie-Fuller method
yielded different conclusions than did uncorrected ANCOVA for both
evaluations. Specifically, Head Start produced significant gains in
Metropolitan Readiness Test scores for black children according to the
DeGracie-Fuller tests but not according to traditional analysis of
covariance. The analysis of Title 1 reading scores using classical
ANCOVA indicated a significant negative effect of participation. When
lower-bound estimates of reliability were inserted in DeGracie and
Fuller's method, however, the differences between control and treatment
groups were not significant. Rindskopf's (1976) reanalyses provide
another important demonstration of the potentially deleterious
consequences of measurement error in drawing inferences from analyses of
observed change using analysis of covariance.

The other applications of Fuller's methods are not suitable for our
purposes because either they do not relate to change, or they do not
report comparisons with observed score regressions. However, the
examples do demonstrate that the $b_j^*$ can either underestimate or
overestimate the $b_j$ (Fuller and Hidiroglou, 1977; Warren et al.,
1974). Additionally, the Warren et al. (1974) analysis of managerial
role performance yielded estimates all of which had larger standard
errors than those from the corresponding ordinary regression. They
suggest that this result is almost always to be expected. When comparing
the performances of Porter's (1967) and DeGracie and Fuller's (1972)
methods with real data, Rindskopf (1976) observed that the sampling
variances of the DeGracie-Fuller estimators always exceeded those of
Porter. These results provide some support for Warren et al.'s (1974)
speculation. Contrast this with St. Pierre and Ladner's (1977) decrease
in standard errors with the Stouffer-Lindley method.

Rindskopf (1976) also provided a demonstration of the use of Porter's
(1967) method with Head Start and Title 1 ESEA data. He found that the
corrected results from Porter's method contradicted those from classical
ANCOVA in the same ways that were described above for the DeGracie-Fuller
method. Porter's method appeared to be more powerful than that of
DeGracie and Fuller, leading Rindskopf (1976) to recommend it, especially
in situations where the covariate has low reliability. This conclusion
must be regarded somewhat skeptically, however, since it appears that
some of Rindskopf's corrections were invalid because they created
non-Gramian covariance matrices. Although the DeGracie-Fuller method
ensures that such impossible matrices will not be constructed, Porter's
ANCOVA correction method does not. Researchers wishing to apply
correction methods in order to estimate true-score effects must be
careful with both the Porter and Stouffer-Lindley methods to use
reliability estimates that do not generate impossible data.

Olejnik and Porter (1981) recently pointed out some important
considerations in applying Porter's correction method and provided
additional illustrations of its application. Porter and Chibucos (1974)
furnish a hypothetical example in which the observed score and estimated
true score ANCOVAs lead to different inferences. They conclude that the
estimated true-score ANCOVA should be used to evaluate change when the
pretest is fallible and pre-existing differences between the groups
obtain.

CONCLUSION 

This concludes our review of the major correction-for-attenuation
methods which can be used in test-retest studies of change where
information about the reliability of the pretest is available. We have
collected and analyzed methods from statistics, education, and the social
sciences. The methods of Porter (1967), Stroud (1972), and DeGracie and
Fuller (1972) can be used in situations appropriate for the analysis of
covariance. Of these, Porter's and DeGracie and Fuller's procedures have
the more general applicability. The exactness of Stroud's method,
however, strongly commends it for the two-group design. The
DeGracie-Fuller method appears less powerful than Porter's, but this
disadvantage may be offset by reduced bias and greater safety.

For the more general multiple regression kinds of analyses (including
ANCOVA), researchers may select one of the Stouffer-Lindley or Fuller
methods. It seems clear that for data which conform to the usual
assumptions of normality, homoscedasticity, etc., the statistical
estimation and testing procedures developed by Fuller (1980), Fuller and
Hidiroglou (1978), and Warren et al. (1974) will prove superior to the
Stouffer-Lindley methods. Not only are Fuller's methods safer in the
sense that they preclude the estimation of singular covariance matrices
(of the predictor variables), they yield significance tests which are
valid for finite samples. We have been unable, however, to establish
analytically which technique possesses greater power. This issue is
addressed in the simulation studies reported in Chapter VII.

It can be concluded from our review that most of the problems
associated with estimation of the true-score regression weights have been
solved by the proposed methods. The unsolved problem of greatest
importance involves the unbiased estimation of the sampling variances of
the disattenuated regression coefficients and the validity of associated
significance tests. This chapter has helped to clarify and refine these
issues, and the simulation studies reported below add further insight.
Questions involving the type of reliability estimate to use and the
heterogeneity of regressions constitute important problems that need to
be addressed in future research.

The applications of the correction methods to real data amply
illustrated the kinds of errors of inference that may have resulted from
errors of measurement in previous investigations of educational change.
It is hoped that the explication and evaluation of the
attenuation-correction methods provided in this chapter will encourage
and facilitate their use in future studies.








Note that this is in contrast to Stroud's (1972) suggestion that the
appropriate $b_j$ are those from the rescaled covariance matrix.

An approximate method, which is very similar in definition to Porter's,
has been proposed by Corder-Bolz (1978) and evaluated in simulation studies






CHAPTER IV 

ESTIMATION OF LINEAR FUNCTIONAL RELATIONS
INTRODUCTION 

The purpose of this chapter is to analyze the problem of determining
if a perfect linear relation exists among two or more variables and to
review some statistical methods that have been developed to estimate and
test linear functional relations. By definition, a linear functional
relation (LFR) exists if the true scores on two (or more) measures are
perfectly correlated. Although most of the statistical work on LFR has
been done by econometricians, a problem has been investigated which is
formally identical to LFR in the field of psychometrics.
Psychometricians have developed several statistical tests of the
hypothesis that two scales measure the same attribute except for
differences in means, units of measurement, and standard errors of
measurement (or reliabilities). When scales satisfy these conditions
they are said to be equivalent or congeneric. As is demonstrated below,
equivalent tests are related by a linear functional relation. The
correlation between equivalent measures, i.e., between two variables that
have a linear functional relation, when corrected for attenuation
(unreliability) is 1.0. In this chapter the diverse theory and methods
from econometrics, statistics, education, and psychometrics are
collected, compared, and integrated. Several new results are derived for
the errors-in-variables problem which should prove helpful in analyzing
change occurring in measures which contain errors of measurement.

Testing hypotheses about LFR or the equivalence of measures has wide
application in the analysis of change, although this has not been
recognized heretofore. LFR methods could be utilized in test-retest
studies which are designed to provide separate estimates of unreliability
and instability in the measures (see Heise, 1969). Since LFR represents
a specific model specification about the relation of the pretest and
posttest scores, i.e., no stochastic error (see Fuller, 1980; Isaac,
1970), LFR methods could be used like any of the methods in the previous
chapter to estimate and test hypotheses about change. This has been done
previously in economics but not in educational research. Some of the LFR
models could be usefully applied to the problem of inferring causal
effects in cross-lag panel correlation. In the context of the general
(LISREL) formulation of change in latent variables presented in the
second chapter, some of the methods for assessing the equivalence of
scales could be employed to evaluate the adequacy of the multi-indicator
measurement models relating the observed to the true scores. Finally,
LFR methods could provide valuable insight concerning the invariance of
measurement metrics and validities over time and between groups in
program evaluations (see Bejar, 1980).

It is hoped that this review of LFR methods and the derivation of new
results will prove of value to educational researchers who are concerned
with the preceding problems and statistical methods. The remainder of
this section is devoted to giving a precise mathematical formulation of
linear functional relations and equivalence. In the next section
definitional issues concerning various types of equivalent measures are
discussed. After notation and data layout conventions have been
introduced, methods for determining LFR are reviewed. The methods are
organized according to the type of information they require. Thus, the
review is divided into procedures that require replicate measures and
those that use information about the variances of the errors of
measurement.

For the purpose of exposition let us assume that the following simple
measurement model holds for two observed tests, x and y:

\[ x = X + e_x, \quad \text{and} \tag{1} \]
\[ y = Y + e_y, \tag{2} \]

where x and y are observed scores, X and Y are unobserved true scores or
latent variables, and $e_x$ and $e_y$ are random errors of measurement. By
definition x and y are functionally related measures if the correlation
between the true scores X and Y is unity, i.e., they are equivalent.
Although generally x and y will be pre- and posttests, respectively, the
model does not require this.

The correlation among the latent variables can be estimated in
several ways. If the reliabilities of the tests are known, then the
correlation between the true scores ($r_{YX}$) can be estimated by applying
Spearman's (1904a, 1904b, 1907, 1910) correction for attenuation to the
sample correlation between the observed variables ($r_{yx}$):

\[ \hat{r}_{YX} = \frac{r_{yx}}{\sqrt{(r_{xx})(r_{yy})}}, \tag{3} \]

where $r_{xx}$ and $r_{yy}$ are the reliabilities of x and y.
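The correction in Equation 3 is simple enough to sketch in a few lines. The following Python fragment is only an illustration under stated assumptions: the function name `disattenuate` and its argument names are hypothetical, and the reliabilities are taken as known constants rather than estimated.

```python
import math


def disattenuate(r_yx: float, rel_x: float, rel_y: float) -> float:
    """Spearman's correction for attenuation (Equation 3).

    r_yx  -- sample correlation between the observed scores
    rel_x -- reliability of x (assumed known, 0 < rel_x <= 1)
    rel_y -- reliability of y (assumed known, 0 < rel_y <= 1)
    """
    if not (0.0 < rel_x <= 1.0 and 0.0 < rel_y <= 1.0):
        raise ValueError("reliabilities must lie in (0, 1]")
    # Divide the observed correlation by the geometric mean of the reliabilities.
    return r_yx / math.sqrt(rel_x * rel_y)


# An observed correlation of .60 with both reliabilities equal to .80
# corresponds to an estimated true-score correlation of about .75.
print(round(disattenuate(0.60, 0.80, 0.80), 4))
```

Note that nothing in the formula keeps the corrected value at or below 1.0; with poor reliability estimates the result can exceed unity, which is one form of the "impossible data" problem discussed in the previous chapter.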

If there are multiple measures or indicators of X (say, $x_1$ and $x_2$)
and Y (say, $y_3$ and $y_4$), alternative estimators of $r_{YX}$ are
available. For example, Lord (1957) gives the maximum likelihood
estimate as

[Equation 4 is not legible in the source.]    (4)

(For alternative formulas based on the same covariances, see Kelley
[1947] and Werts & Linn [1972].) Given a particular estimate of $r_{YX}$,
the problem is to test the hypothesis that in the population $r_{YX} = 1.0$.
The null and restricted alternative hypotheses may be written:

\[ H_0 : r_{YX} = 1.0, \tag{5} \]
\[ H_1 : r_{YX} < 1.0. \tag{6} \]

It is also possible to express these hypotheses as tests of
restrictions placed on the linear model relating Y and X. Recognizing
that Y and X may differ in their means and their scaling or units of
measurement, we write

\[ g_1 Y = c + g_2 X, \tag{7} \]

where $g_1$ and $g_2$ are scale coefficients and c is a constant which is a
function of the differences in means between Y and X. Since $g_1$ and $g_2$
can be absorbed into a new coefficient $g = g_2/g_1$ and an intercept
defined as $a = c/g_1$, Equation 7 expresses Y as a linear
transformation of X. Note that the structural model given by Equation 7
does not include a stochastic term to incorporate the effects of chance
disturbances or misspecification into the model. As stated, it holds
that a perfect linear relation exists between Y and X. This is referred
to as a functional relation in the statistical literature (Isaac, 1970;
Kendall & Stuart, 1961). The null and alternative hypotheses to be
tested are:

\[ H_0 : g_1 Y - g_2 X - c = 0, \tag{8} \]
\[ H_1 : g_1 Y - g_2 X - c \neq 0. \tag{9} \]

It should be apparent that since correlations are invariant over changes
in scale and origin, the null hypotheses given by Equations 5 and 8 are
the same.

There is a third way of formulating the model that will prove useful
in developing some of the statistical tests of equivalence. To Equation
7 we add an error term f representing random fluctuations in the fit of
the model:

\[ g_1 Y = c + g_2 X + f. \tag{10} \]

If the model fits the data perfectly, the error in the equation, f, will
be identically 0 for all members of the population. Thus, the hypothesis
of equivalence or perfect linear relation among true scores X and Y can
be evaluated by testing if the variance of f exceeds zero:

\[ H_0 : \sigma_f^2 = 0, \tag{11} \]
\[ H_1 : \sigma_f^2 > 0. \tag{12} \]

The null hypothesis given in Equation 11 is the same as those in
Equations 5 and 8.
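A small simulation may help make the three equivalent formulations concrete. The sketch below is an illustration only, not part of the original analyses: the function names `pearson` and `simulate_lfr`, the sample size, the intercept and slope, and the error standard deviations are all assumed values. True scores are generated with an exact linear relation (so f is identically zero), errors are added, and the observed correlation is compared with the true-score correlation.

```python
import math
import random


def pearson(a, b):
    """Product-moment correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)


def simulate_lfr(n=20000, a=5.0, g=2.0, sd_ex=0.5, sd_ey=1.0, seed=1):
    """Generate true scores with Y = a + g*X exactly, then add errors.

    All parameter values are assumptions chosen for illustration. With
    var(X) = 1 they give reliabilities of 1/(1 + .25) = .80 for x and
    4/(4 + 1) = .80 for y.
    """
    rng = random.Random(seed)
    X = [rng.gauss(0.0, 1.0) for _ in range(n)]
    Y = [a + g * t for t in X]                    # perfect linear relation
    x = [t + rng.gauss(0.0, sd_ex) for t in X]    # x = X + e_x
    y = [t + rng.gauss(0.0, sd_ey) for t in Y]    # y = Y + e_y
    return X, Y, x, y


X, Y, x, y = simulate_lfr()
r_true = pearson(X, Y)   # equals 1.0 up to rounding: the LFR holds
r_obs = pearson(x, y)    # attenuated, close to sqrt(.80 * .80) = .80
print(round(r_true, 6), round(r_obs, 3), round(r_obs / 0.80, 3))
```

The observed correlation falls well short of unity even though the functional relation is exact, which is why the hypotheses above must be stated in terms of the true scores.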

A variety of different ways of testing Equations 5, 8, and 11
under various types of assumptions have been proposed. In order to
explicate these methods and their assumptions we now define equivalence
and present Jöreskog's model of congeneric tests.

DEFINITION OF EQUIVALENCE AND CONGENERIC TESTS

In the development of classical test theory the concept of strictly
parallel tests has played a crucial role. Strictly parallel tests by
definition are tests which have equal means, equal variances, equal
covariances, and equal validities with respect to any criterion. It
follows that parallel tests have equal standard errors of measurement and
equal reliabilities. A person has the same true score on parallel tests
(Gulliksen, 1950; Lord & Novick, 1968). Gulliksen (1968) argues that
scores on parallel tests are completely "interchangeable." That is,
scores from one parallel test can be substituted for those from another
parallel test without any loss of information whatsoever. Obviously,
parallel tests satisfy the criterion for equivalence, that is, they
measure the same trait except for errors of measurement. However, other
tests, which do not meet the rigorous requirements for parallelism,
likewise satisfy the criterion. For example, tau equivalent tests, i.e.,
tests which meet all of the requirements for parallelism except that they
have different standard errors and consequently unequal reliabilities
(Lord & Novick, 1968), measure the same underlying attribute. While tau
equivalent tests measure the same construct, they do so with varying
degrees of accuracy (reliability). The model for essentially tau
equivalent tests goes even further by relaxing the restriction on equal
means. Thus, an individual will not necessarily have equal true scores
on essentially tau equivalent tests; however, all true scores on one
essentially tau equivalent test will differ from those on another
essentially tau equivalent test only by a constant.

The most general model for tests that measure the same attribute
except for errors of measurement is Jöreskog's (1971) theory of
congeneric tests. In this model almost all of the restrictions on
parallel tests have been eliminated. The tests need not possess equal
true means, equal true or error variances, or equal reliabilities.
Individuals do not have to have equal true scores on congeneric tests.
This implies that congeneric tests have different origins, or means, and
different scales, or units of measurement. However, true scores on one
congeneric test are a perfect linear function of true scores on another
congeneric test. Congeneric tests meet Gulliksen's (1968) criteria for
"scientific equivalence," that is, they measure the same underlying
attribute. In factor analytic terminology congeneric tests reflect a
single general common factor. Thus, although congeneric tests are not
strictly interchangeable or substitutable and do not possess equal
accuracy, they contain information about the same latent variable which
underlies each of the tests. A detailed comparison of the various test
models is provided in Table 1.

Insert Table 1 about here

Throughout the remainder of this chapter we shall use the terms
equivalent and congeneric synonymously. For most research
applications it is the congeneric type of equivalence that will be of
interest. Now these ideas about the equivalence of measures are given a
more precise mathematical form by developing Jöreskog's (1970, 1971,
1974) theory of congeneric tests.
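The nesting of the four test models described above can be summarized in a short sketch. It assumes, purely for illustration, that each test is described by three population parameters in the congeneric form developed below (an origin m, a scale coefficient b on the common latent variable, and an error variance); the class `Measure` and the function `classify` are hypothetical names, not part of any published procedure.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Measure:
    m: float        # origin (intercept) of the true score
    b: float        # scale coefficient on the common latent variable
    err_var: float  # variance of the errors of measurement


def classify(t1: Measure, t2: Measure, tol: float = 1e-9) -> str:
    """Name the most restrictive model that a pair of tests satisfies.

    Both tests are assumed to measure the same latent variable, so the
    least restrictive answer is always "congeneric".
    """
    same_scale = math.isclose(t1.b, t2.b, abs_tol=tol)
    same_origin = math.isclose(t1.m, t2.m, abs_tol=tol)
    same_error = math.isclose(t1.err_var, t2.err_var, abs_tol=tol)
    if same_scale and same_origin and same_error:
        return "parallel"
    if same_scale and same_origin:
        return "tau equivalent"
    if same_scale:
        return "essentially tau equivalent"
    return "congeneric"


print(classify(Measure(50, 10, 25), Measure(50, 10, 25)))  # parallel
print(classify(Measure(50, 10, 25), Measure(50, 10, 36)))  # tau equivalent
print(classify(Measure(50, 10, 25), Measure(55, 10, 36)))  # essentially tau equivalent
print(classify(Measure(50, 10, 25), Measure(20, 4, 9)))    # congeneric
```

Each model relaxes one restriction of the model before it, so every pair of parallel tests is also tau equivalent, essentially tau equivalent, and congeneric.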

Let us assume that there are two replicate measures on each of two
scales x and y. By definition $x_1$ is congeneric with $x_2$ and $y_3$ is
congeneric with $y_4$. While the replications may represent two alternate
forms or test-retest measurements on the same test, in most cases these
replications are obtained by splitting tests x and y into halves. Such
split halves may or may not be parallel, but they must be congeneric for
the following developments to hold. These congeneric replicates or
multiple indicators are necessary in order to identify the parameters of
the true and error distributions. Without the additional information
provided by the multiple measurements the true score correlation cannot
be estimated or tested against a hypothesized value (e.g., 1.0). The
reader is referred to the text by Hanushek and Jackson (1977) for an
excellent introduction to this problem of identification.

According to classical theory the observed scores can be written as a
linear composite of true and error components:

\[ x_1 = X_1 + e_1, \tag{13} \]
\[ x_2 = X_2 + e_2, \tag{14} \]
\[ y_3 = Y_3 + e_3, \tag{15} \]
\[ y_4 = Y_4 + e_4, \tag{16} \]

where $X_1$, $X_2$, $Y_3$, and $Y_4$ are true scores and the $e_i$ are random
errors of measurement. Under the classical assumptions true scores are
not correlated with errors, and errors are all mutually uncorrelated.
Since by definition the correlation between congeneric tests is 1.0,
$r_{X_1 X_2} = r_{Y_3 Y_4} = 1.0$. Therefore, new random variables X and Y can be
defined which are perfectly linearly related to the true scores on the
individual tests $X_1$, $X_2$ and $Y_3$, $Y_4$, respectively, as follows:

\[ X_1 = m_1 + b_1 X, \tag{17} \]
\[ X_2 = m_2 + b_2 X, \tag{18} \]
\[ Y_3 = m_3 + b_3 Y, \tag{19} \]
\[ Y_4 = m_4 + b_4 Y. \tag{20} \]

Substitution of Equations 17 through 20 into Equations 13 through 16
yields the congeneric measurement model:

\[ x_1 = m_1 + b_1 X + e_1, \tag{21} \]
\[ x_2 = m_2 + b_2 X + e_2, \tag{22} \]
\[ y_3 = m_3 + b_3 Y + e_3, \tag{23} \]
\[ y_4 = m_4 + b_4 Y + e_4. \tag{24} \]



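In matrix form this result can be written

\[
\begin{bmatrix} x_1 \\ x_2 \\ y_3 \\ y_4 \end{bmatrix}
=
\begin{bmatrix} m_1 \\ m_2 \\ m_3 \\ m_4 \end{bmatrix}
+
\begin{bmatrix} b_1 & 0 \\ b_2 & 0 \\ 0 & b_3 \\ 0 & b_4 \end{bmatrix}
\begin{bmatrix} X \\ Y \end{bmatrix}
+
\begin{bmatrix} e_1 \\ e_2 \\ e_3 \\ e_4 \end{bmatrix},
\quad \text{or} \quad
\mathbf{x} = \mathbf{m} + \mathbf{B}\,\mathbf{t} + \mathbf{e},
\tag{25}
\]

where $\mathbf{x}$ is a vector of observed scores for an individual, $\mathbf{m}$ is a vector
of means, $\mathbf{B}$ is a matrix of scaling coefficients, $\mathbf{t}$ is a vector of true
scores, and $\mathbf{e}$ is a vector of errors of measurement.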

Without loss of generality we take the true scores to be expressed in
standardized form ($\mathcal{E}(X) = \mathcal{E}(Y) = 0$; $\mathrm{VAR}(X) = \mathrm{VAR}(Y) = 1.0$).
Then the structural model relating Y to X is

\[ g_1 Y = g_2 X, \quad \text{or alternatively} \tag{26} \]
\[ Y = \frac{g_2}{g_1} X = gX, \quad \text{and} \tag{27} \]

the covariance matrix of the vector $\mathbf{t}$ is

\[
\Sigma_{XY} =
\begin{bmatrix} 1.0 & r_{XY} \\ r_{YX} & 1.0 \end{bmatrix}.
\tag{28}
\]



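From Jöreskog's (1970) covariance structure analysis the covariance matrix
of observed variables can be written as a function of the parameter
matrices:

\[
\begin{bmatrix}
\sigma_{11} & \sigma_{12} & \sigma_{13} & \sigma_{14} \\
\sigma_{21} & \sigma_{22} & \sigma_{23} & \sigma_{24} \\
\sigma_{31} & \sigma_{32} & \sigma_{33} & \sigma_{34} \\
\sigma_{41} & \sigma_{42} & \sigma_{43} & \sigma_{44}
\end{bmatrix}
=
\begin{bmatrix} b_1 & 0 \\ b_2 & 0 \\ 0 & b_3 \\ 0 & b_4 \end{bmatrix}
\begin{bmatrix} 1.0 & r_{XY} \\ r_{YX} & 1.0 \end{bmatrix}
\begin{bmatrix} b_1 & b_2 & 0 & 0 \\ 0 & 0 & b_3 & b_4 \end{bmatrix}
+
\begin{bmatrix}
\sigma^2_{e_1} & 0 & 0 & 0 \\
0 & \sigma^2_{e_2} & 0 & 0 \\
0 & 0 & \sigma^2_{e_3} & 0 \\
0 & 0 & 0 & \sigma^2_{e_4}
\end{bmatrix},
\]

or

\[ \Sigma_{xy} = \mathbf{B}\,\Sigma_{XY}\,\mathbf{B}' + \Sigma_e, \tag{29} \]

where $\Sigma_e$ is the diagonal covariance matrix of the errors of measurement.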
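The structure in Equation 29 can be checked numerically. The sketch below builds the model-implied covariance matrix of $(x_1, x_2, y_3, y_4)$ from the scale coefficients, the true-score correlation, and the error variances. It is an illustration only: the function name `implied_cov` and the numerical values in the usage lines are assumptions, and plain Python lists are used instead of a matrix library.

```python
def implied_cov(b, r_xy, err_var):
    """Covariance matrix implied by the congeneric model of Equation 29.

    b       -- scale coefficients [b1, b2, b3, b4]
    r_xy    -- correlation between the standardized true scores X and Y
    err_var -- error variances [var(e1), var(e2), var(e3), var(e4)]
    """
    # Rows of B: x1 and x2 load on X only, y3 and y4 load on Y only.
    load = [(b[0], 0.0), (b[1], 0.0), (0.0, b[2]), (0.0, b[3])]
    phi = [[1.0, r_xy], [r_xy, 1.0]]  # covariance matrix of (X, Y)
    sigma = []
    for i in range(4):
        row = []
        for j in range(4):
            # (B * Phi * B')[i][j], written out element by element
            common = sum(load[i][p] * phi[p][q] * load[j][q]
                         for p in range(2) for q in range(2))
            # Errors are mutually uncorrelated, so they enter the diagonal only.
            row.append(common + (err_var[i] if i == j else 0.0))
        sigma.append(row)
    return sigma


sigma = implied_cov([1.0, 2.0, 3.0, 4.0], 0.5, [1.0, 1.0, 1.0, 1.0])
for row in sigma:
    print(row)
```

In the printed matrix the within-scale covariances (for example, that of $x_1$ with $x_2$) do not involve $r_{XY}$, whereas every between-scale covariance is proportional to it; this is what allows the true-score correlation to be identified from replicate measures.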



It is also possible to specify the parametric structure of the mean
vector within Jöreskog's (1970) Analysis of COVariance Structure model
(ACOVS). Let D be an N x 4 matrix of scores on tests $x_1$, $x_2$, $y_3$,
and $y_4$ from a sample of size N, E an N x 1 matrix of ones, and G a 4 x
4 matrix specifying constraints on the mean vector. Then the population
means are structured as

\[ \mathcal{E}(D) = E\,\mathbf{m}'\,G. \tag{30} \]

When G is the identity matrix, $\mathcal{E}(D) = E\,\mathbf{m}'$.

If x and y are congeneric, then $r_{XY}$ will equal 1.0. That is,
$\sigma_{XY}/(\sigma_X \sigma_Y) = r_{XY} = 1.0$. Thus, testing the hypothesis that two
tests x and y measure the same attribute except for differences in means,
units of measurement, and errors of measurement (or that x and y are
scientifically equivalent or that they have a LFR) reduces to testing the
hypothesis that $r_{XY} = 1.0$. Under congeneric assumptions the parameters
$b_1$, $b_2$, $b_3$, $b_4$, $m_1$, $m_2$, $m_3$, $m_4$, $\sigma_{e_1}$, $\sigma_{e_2}$, $\sigma_{e_3}$, and $\sigma_{e_4}$
are free to assume any real finite values. A path diagram for the
congeneric model is presented in Figure 1. Since the true scores are
standardized, the coefficient g must equal 1.0 when x and y are
congeneric.

Insert Figure 1 about here



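Before concluding this section a final point should be made
concerning the rank of the matrix $\Sigma_{XY}$. If $x_1$, $x_2$, $y_3$, and $y_4$ are
all congeneric, then Equation 25 can be rewritten as

\[
\begin{bmatrix} x_1 \\ x_2 \\ y_3 \\ y_4 \end{bmatrix}
=
\begin{bmatrix} m_1 \\ m_2 \\ m_3 \\ m_4 \end{bmatrix}
+
\begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \end{bmatrix} t
+
\begin{bmatrix} e_1 \\ e_2 \\ e_3 \\ e_4 \end{bmatrix},
\tag{31}
\]

and Equation 29 as

\[ \Sigma_{xy} = \mathbf{b}\,\mathbf{b}' + \Sigma_e, \tag{32} \]

which is formally equivalent to a factor analytic model with one common
factor. When $r_{XY}$ is unity, the rank of $\Sigma_{XY}$, and hence of the common
part $\mathbf{b}\,\mathbf{b}'$ of $\Sigma_{xy}$, equals one. Testing the restriction that
$r_{XY} = 1.0$ is equivalent to testing whether a single factor model fits
the data (Gulliksen, 1968; Jöreskog, 1974). The path diagram for the one
factor model is depicted in Figure 2.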
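A short worked derivation may clarify why replicate measures identify the true-score correlation and why the one-factor restriction corresponds to $r_{XY} = 1.0$. It follows directly from the congeneric covariance structure given above for $(x_1, x_2, y_3, y_4)$ with standardized true scores and mutually uncorrelated errors; the $\sigma_{ij}$ are the population covariances of the four observed half-tests, and the scale coefficients are assumed to be nonzero.

```latex
% Off-diagonal elements of the observed covariance matrix under the
% congeneric model (errors contribute to the diagonal only):
\sigma_{12} = b_1 b_2, \qquad \sigma_{34} = b_3 b_4, \qquad
\sigma_{13} = b_1 b_3\, r_{XY}, \quad
\sigma_{14} = b_1 b_4\, r_{XY}, \quad
\sigma_{23} = b_2 b_3\, r_{XY}, \quad
\sigma_{24} = b_2 b_4\, r_{XY}.

% The scale coefficients cancel in the following ratios:
r_{XY}^2 \;=\; \frac{\sigma_{13}\,\sigma_{24}}{\sigma_{12}\,\sigma_{34}}
        \;=\; \frac{\sigma_{14}\,\sigma_{23}}{\sigma_{12}\,\sigma_{34}} .

% Hence r_{XY} = 1.0 (taking all b_i > 0) holds exactly when
\sigma_{12}\,\sigma_{34} \;=\; \sigma_{13}\,\sigma_{24} \;=\; \sigma_{14}\,\sigma_{23},
% which is the vanishing-tetrad condition of a single common factor.
```

The equality $\sigma_{13}\sigma_{24} = \sigma_{14}\sigma_{23}$ holds for any value of $r_{XY}$; it is the additional equality with $\sigma_{12}\sigma_{34}$ that carries the hypothesis of equivalence.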



Insert Figure 2 about here 






DATA ORGANIZATION AND NOTATION

In order to facilitate our comparison of the several statistical
procedures for determining LFR and the equivalence of measures, a common
data layout and notation will be employed for all the procedures. First,
we assume that there are measurements on two tests x and y which have
been split into congeneric halves: $x_1$ and $x_2$, and $y_3$ and $y_4$, as
was described above. Thus, the congeneric measurement model specified in
Equations 21 through 24 holds. The scores for N persons on the four
tests are organized according to the schema presented in Table 2, which
also illustrates the notational conventions that have been adopted.

Insert Table 2 about here

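The population covariance matrix of the vector $(x_1, x_2, x, y_3, y_4, y)$
may be expressed in lower triangular form as

\[
\Sigma(x_1, x_2, x, y_3, y_4, y) =
\begin{bmatrix}
\sigma_{11} & & & & & \\
\sigma_{21} & \sigma_{22} & & & \text{(symmetric)} & \\
\sigma_{x1} & \sigma_{x2} & \sigma_{xx} & & & \\
\sigma_{31} & \sigma_{32} & \sigma_{3x} & \sigma_{33} & & \\
\sigma_{41} & \sigma_{42} & \sigma_{4x} & \sigma_{43} & \sigma_{44} & \\
\sigma_{y1} & \sigma_{y2} & \sigma_{yx} & \sigma_{y3} & \sigma_{y4} & \sigma_{yy}
\end{bmatrix}.
\tag{33}
\]

The estimator of $\Sigma$ derived from a sample of size N will be denoted S, the
elements of which are deviation sums of squares and cross products
divided by N-1 and symbolized by s. The estimator of the population
correlation matrix is derived from S and may be written

\[
R(x_1, x_2, x, y_3, y_4, y) =
\begin{bmatrix}
1.0 & & & & & \\
r_{21} & 1.0 & & & \text{(symmetric)} & \\
r_{x1} & r_{x2} & 1.0 & & & \\
r_{31} & r_{32} & r_{3x} & 1.0 & & \\
r_{41} & r_{42} & r_{4x} & r_{43} & 1.0 & \\
r_{y1} & r_{y2} & r_{yx} & r_{y3} & r_{y4} & 1.0
\end{bmatrix}.
\tag{34}
\]

We will refer to the entries in $S(x_1, x_2, x, y_3, y_4, y)$ and in
$R(x_1, x_2, x, y_3, y_4, y)$ frequently in the sections to follow. The
definitions of the other vectors and matrices remain as given in the
preceding sections, e.g., $\mathbf{m}$, $\mathbf{B}$, $\Sigma_{XY}$.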



REVIEW OF METHODS FOR DETERMINING EQUIVALENCE

The purpose of this section is to explicate and compare seven
statistical methods designed for determining if the true scores from two
or more tests are perfectly linearly related. They are divided into
three sets depending upon the type of information or data required. The
first group contains the three best methods of those which require
replicate measures of each scale: Jöreskog (1971), Kristof (1973), and
Lord (1973). Gulliksen (1968) and Dunivant (1979) have reviewed other
less optimal procedures in this group.

In the second set are three methods which assume information is
available about the covariance structure of the errors of measurement.
While such information can be obtained from replicated data, it may come
from any other independent sources. These methods, which were formulated
primarily by statisticians concerned with estimating and testing linear
functional relations, include the methods of Koopmans (1937) and Tintner
(1945, 1946), Fuller (1980), and Jöreskog (1971).

The third set of methods includes only Fuller and Hidiroglou's (1978)
method for testing matrix singularity when independent information about
the reliabilities of the variables is available. The procedure uses the
reliabilities to adjust the covariance matrix of observed scores in much
the same way that the estimates of measurement error variances are
utilized by the procedures in the second group. Indeed, all seven
procedures are very similar in logic, if not in mathematical detail:
each uses information about the covariance structure of the observed
measures and errors of measurement (from replicate measurements, error
variance estimates, or reliability estimates) to estimate the parameters
of the linear functional relation.



To the extent possible the same outline has been followed in
describing each of the methods. At the outset the statistical model and
its assumptions are stated. Then the null and alternative hypotheses
which are tested by the procedure are specified. Next we provide the
computational formulas for the test statistic and describe how its
significance is evaluated. If provided by the test developer, the
estimator of $r_{XY}$ under a true alternative hypothesis is presented.
Finally, an evaluation of the test is made. For example, evidence which
contradicts the validity of the test is discussed. If the efficiency of
a test relative to one or more other tests is known, the superiority of
the method is pointed out. Relevant Monte Carlo results, if available,
are summarized. We also demonstrate that some tests differ only in
computational methods, e.g., in the way the likelihood function is
evaluated. They are identical statistical tests in all other respects.
We begin our consideration with the methods of the first set which
require replicate measurements.

Procedures Using Replications

Lord-Villegas Test

In 1973 Lord demonstrated to psychologists how a statistical
procedure developed by Villegas (1964) to estimate linear functional
relations could be used to test "the hypothesis that two sets of
measurements differ only because of errors of measurement and because of
differing origins and units of measurement" (Lord, 1973, p. 71). The
assumptions of the Lord-Villegas procedure are summarized in Table 3,
which is taken from Dunivant's (1979) review.

Insert Table 3 about here

The reader will note that in addition to the classical assumptions about
true and error components, the model requires the errors of measurement
from any pair of tests to follow a bivariate normal distribution.
However, the errors of measurement for tests $x_1$ and $x_2$ may be
correlated with those from $y_3$ and $y_4$. Although it is not stated by
Lord, Kristof (1973) points out that the Lord-Villegas test requires that
$x_1$ be parallel with $x_2$ and $y_3$ with $y_4$. Inspection of Table 3
reveals that the Lord (1957) test differs in assumptions from the
Lord-Villegas method primarily in terms of which components are required
to be jointly normally distributed and the correlation of errors. Also
the Lord-Villegas test does not depend on sample size for its
justification.

The null hypothesis tested by the Lord-Villegas procedure is precisely
that the two tests x and y are congeneric or scientifically equivalent,
i.e., $H_0 : r_{XY} = 1.0$. The alternative is that the linear relationship
between the true scores is less than perfect.

In order to perform the Lord-Villegas test we must compute three new
matrices. Let us define the matrix W, a within persons matrix, to be

\[
W =
\begin{bmatrix}
s^2_{x(W)} & s_{xy(W)} \\
s_{yx(W)} & s^2_{y(W)}
\end{bmatrix},
\tag{35}
\]

where the elements in W are defined as

\[ s^2_{x(W)} = \sum_{i=1}^{N} \sum_{j=1}^{2} (x_{ij} - \bar{x}_{i\cdot})^2, \tag{36} \]
\[ s_{xy(W)} = \sum_{i=1}^{N} \sum_{j=1}^{2} (x_{ij} - \bar{x}_{i\cdot})(y_{ij} - \bar{y}_{i\cdot}), \tag{37} \]
\[ s^2_{y(W)} = \sum_{i=1}^{N} \sum_{j=1}^{2} (y_{ij} - \bar{y}_{i\cdot})^2. \tag{38} \]

The reader will recall that all of the symbols are defined in Table 2
except $\bar{x}_{i\cdot}$ and $\bar{y}_{i\cdot}$, which equal $(x_{i1} + x_{i2})/2$ and $(y_{i1} + y_{i2})/2$,
respectively.

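Next the among persons sums of squares and cross products matrix A is
written

\[
A =
\begin{bmatrix}
s^2_{x(A)} & s_{xy(A)} \\
s_{yx(A)} & s^2_{y(A)}
\end{bmatrix},
\tag{39}
\]

where (Equations 40 and 41)

\[ s^2_{x(A)} = 2 \sum_{i=1}^{N} (\bar{x}_{i\cdot} - \bar{x}_{\cdot\cdot})^2, \]
\[ s_{xy(A)} = 2 \sum_{i=1}^{N} (\bar{x}_{i\cdot} - \bar{x}_{\cdot\cdot})(\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot}), \]
\[ s^2_{y(A)} = 2 \sum_{i=1}^{N} (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2. \]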

Now we select the significance level at which we wish to evaluate $H_0$,
say .05. From the tabled values of F we find the 1 - .05 = .95 percentile
of the F distribution with N and N degrees of freedom. Finally, we
evaluate the determinant of the matrix C of order 2:

\[ |C| = |A - F_{.95}\,W|. \tag{42} \]

The null hypothesis of equivalent measures is rejected at the .05
significance level if the determinant is positive and if both diagonal
terms are also positive, i.e., "$H_0$ is rejected if and only if the
matrix C is positive definite" (Lord, 1973, p. 71). Lord (1973)
explains that the test is slightly conservative in that a true null will
be rejected somewhat less often than the $\alpha$ value would indicate.
Simulation experiments have verified this (see Dunivant, 1979). However,
the method is almost as powerful as Kristof's (see below), which has the
greatest power among those procedures which have been compared.
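The decision rule just described can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than Lord's or Villegas's own program: the function names are hypothetical, the within and among persons matrices are computed as sums of squares and cross products about the person means and grand means, and the F critical value is passed in by the user because it must be read from tables for the appropriate degrees of freedom.

```python
def sscp_matrices(x1, x2, y3, y4):
    """Return the among (A) and within (W) persons SSCP elements.

    Each argument is a list of N scores on one half-test; x1 and x2 are the
    replicates of scale x, y3 and y4 the replicates of scale y.
    """
    n = len(x1)
    xbar = [(a + b) / 2 for a, b in zip(x1, x2)]   # person means on x
    ybar = [(c + d) / 2 for c, d in zip(y3, y4)]   # person means on y
    gx, gy = sum(xbar) / n, sum(ybar) / n          # grand means
    a_xx = 2 * sum((u - gx) ** 2 for u in xbar)
    a_yy = 2 * sum((v - gy) ** 2 for v in ybar)
    a_xy = 2 * sum((u - gx) * (v - gy) for u, v in zip(xbar, ybar))
    w_xx = sum((a - m) ** 2 + (b - m) ** 2 for a, b, m in zip(x1, x2, xbar))
    w_yy = sum((c - m) ** 2 + (d - m) ** 2 for c, d, m in zip(y3, y4, ybar))
    w_xy = sum((a - mx) * (c - my) + (b - mx) * (d - my)
               for a, b, c, d, mx, my in zip(x1, x2, y3, y4, xbar, ybar))
    return (a_xx, a_xy, a_yy), (w_xx, w_xy, w_yy)


def reject_equivalence(x1, x2, y3, y4, f_crit):
    """True if C = A - f_crit * W is positive definite (reject H0)."""
    (a_xx, a_xy, a_yy), (w_xx, w_xy, w_yy) = sscp_matrices(x1, x2, y3, y4)
    c_xx = a_xx - f_crit * w_xx
    c_xy = a_xy - f_crit * w_xy
    c_yy = a_yy - f_crit * w_yy
    det = c_xx * c_yy - c_xy ** 2
    return det > 0 and c_xx > 0 and c_yy > 0


# Toy data (N = 4) with an arbitrary illustrative critical value of 6.0.
x1, x2 = [1, 2, 3, 4], [1.1, 1.9, 3.1, 3.9]
unrelated = reject_equivalence(x1, x2, [4, 1, 3, 2], [4.1, 0.9, 3.1, 1.9], 6.0)
linear = reject_equivalence(x1, x2, [3, 5, 7, 9], [3.1, 4.9, 7.1, 8.9], 6.0)
print(unrelated, linear)
```

In the first toy data set the person means on y are unrelated to those on x, C is positive definite, and equivalence is rejected; in the second, y is twice x plus one apart from small errors, C has a negative determinant, and the hypothesis of equivalence is retained.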

The control of Type I error, power, computational simplicity, and
somewhat less restrictive assumptions of the Lord-Villegas procedure as
compared with several other procedures (cf. Table 3) would seem to
commend it to general use. However, Lord (1973) cautions that the
procedure may be very sensitive to correlated errors within tests, i.e.,
when $r_{e_1 e_2} \neq 0$ and $r_{e_3 e_4} \neq 0$. For example, positively correlated errors
will tend to increase the elements of W and consequently to decrease the
probability of rejecting $H_0$. The extent to which this procedure is
affected by violations of assumptions concerning measurement error
correlations, linearity, and normality is unknown. Since educational and
psychological data will often fail to satisfy such assumptions, the
robustness of the Lord-Villegas test is an important question.

Kristof's Test

Kristof's (1973) method for testing if a perfect linear relation
exists between the true scores X and Y represents a significant
liberalization of the assumptions of the parallelism of $x_1$ and $x_2$ and
of $y_3$ and $y_4$ required by most of the procedures based on
replications. In fact, Table 3 shows that this method makes only three
assumptions: 1) that the errors of measurement within scale x and within
scale y are uncorrelated, 2) that the errors are not correlated with true
scores, and 3) that the errors are multinormally distributed. Kristof's
test appears to require fewer restrictive assumptions than any of the
other methods.

The null hypothesis is "that two variables have perfect disattenuated
correlation, hence measure the same trait except for errors of
measurement. This hypothesis is equivalent to saying, within the adopted
model, that true scores of two psychological tests satisfy a perfect
linear relation" (Kristof, 1973, p. 101). This may be written as:

\[ H_0 : a_1 X + a_2 Y + a_0 = 0 \quad (\text{for } a_1, a_2 \neq 0). \tag{43} \]

The reader should notice that either $a_1$ or $a_2$ will be less than zero,
i.e., will be negative, under $H_0$. Equation 43 can be rearranged to
clarify the nature of the perfect linear relation:

\[ Y = -\frac{a_0}{a_2} - \frac{a_1}{a_2} X. \tag{44} \]

Obviously, if $\mathcal{E}(Y) = \mathcal{E}(X) = 0$ and $\mathrm{var}(Y) = \mathrm{var}(X) = 1.0$, then $-a_1/a_2 = r_{XY}$.
The alternative hypothesis holds that Equation 43 is nonzero.

In order to derive Kristof's test statistic we define two new
variables, $f_1$ and $f_2$:

\[ f_1 = a_1 x_1 + a_2 y_3 = a_1 X_1 + a_2 Y_3 + a_1 e_1 + a_2 e_3, \tag{45} \]
\[ f_2 = a_1 x_2 + a_2 y_4 = a_1 X_2 + a_2 Y_4 + a_1 e_2 + a_2 e_4. \tag{46} \]

(Refer to Equations 13 through 24 for definitions of the variables.)
Kristof observed that when $H_0$ is true, $f_1$ and $f_2$ will correlate
exactly zero. Thus, $H_0$ can be reformulated as $H_0 : r_{f_1 f_2} = 0$.
If $\hat{r}_{f_1 f_2}$ exceeds the critical value of $r_{f_1 f_2}$ for a
prespecified $\alpha$ level, then $H_0$ is rejected. The test is conservative
according to Kristof (1973, p. 108): if rejection of $H_0$ occurs, the true
level will not exceed the nominal $\alpha$. In order to test $H_0$, we compute
the minimum possible value of $r_{f_1 f_2}$ given the data, subject to the
restriction that $a_1, a_2 \neq 0$. Letting $r_{min}$ be the minimum value of
$r_{f_1 f_2}$, compute

\[ t = \frac{r_{min}\sqrt{N-2}}{\sqrt{1 - r_{min}^2}}. \tag{47} \]

If the sample value of t exceeds the tabled value of t for N-2 df, then
we reject the hypothesis of equivalence of x and y. A one-tailed test is
performed because of the asymmetry of the alternative hypothesis, $H_1 : r_{YX} < 1.0$.
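Once the minimum correlation has been obtained, the final step is the familiar t transformation of a correlation coefficient on N-2 degrees of freedom. The sketch below illustrates only that last step; the function name `t_from_r` is hypothetical, the value of the minimum correlation is assumed to have been computed already by the procedure described next, and the critical value must still be taken from a t table.

```python
import math


def t_from_r(r_min: float, n: int) -> float:
    """t statistic for a correlation r_min based on n persons (df = n - 2)."""
    if n <= 2:
        raise ValueError("need more than two persons")
    if not -1.0 < r_min < 1.0:
        raise ValueError("r_min must lie strictly between -1 and 1")
    return r_min * math.sqrt(n - 2) / math.sqrt(1.0 - r_min ** 2)


# Illustrative values: a minimum correlation of .50 from 27 persons.
print(round(t_from_r(0.50, 27), 4))
```

With these illustrative values the statistic is about 2.89 on 25 degrees of freedom, which would be compared with the one-tailed tabled value at the chosen significance level.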
Kristof (1973) describes several tests which are based on different
assumptions concerning the parameters of the error distributions.
However, we shall develop only the least restricted model. As a first
step in computing $r_{min}$ we rearrange the rows and columns in S to
form S':

\[
S' =
\begin{bmatrix}
s_{11} & s_{13} & s_{12} & s_{14} \\
s_{31} & s_{33} & s_{32} & s_{34} \\
s_{21} & s_{23} & s_{22} & s_{24} \\
s_{41} & s_{43} & s_{42} & s_{44}
\end{bmatrix}
=
\begin{bmatrix}
S_{11} & S_{12} \\
S_{21} & S_{22}
\end{bmatrix}.
\]

Next an eigen decomposition is performed to yield orthogonal
matrices P and T of order 2:

[The displayed equation is not legible in the source.]

In the next step new matrices Q and U are found as follows:

\[ Q = (q_{jk}) \quad \text{for } j = 1, 2;\ k = 1, 2, \text{ and} \tag{49} \]
\[ U = (u_{jk}) \quad \text{for } j = 1, 2;\ k = 1, 2. \tag{50} \]

[The matrix expressions defining Q and U in terms of P, T, and the
blocks of S' are not legible in the source.]

Finally^ we 8ol\^e the quart Ic trigonometric equation 

• . h^cot^ V + hgcot^ V + h^cot^TLt h^SOt V + h - 0, (51) 

^ , (52) 

where * h^ - q^iUii + '^U^IX* 

^ h ' 'lll^^Za - "11 ^^22 - 'lll^ ';12"12' . 

hj - 3 [qi2("22 " "11^ "12 ^^^22 " ^^ll^^ ' ^^""^ 

H - ^122 <"22 - "11>^'»22 - 'lll^ " '^^"n' ^^^^ 
. ' ' _ - (56) 

1 ^0 ■ "<ll2"22 " 'l22"l2- . - . 

There will be four solutions of this equation, two of which must be
real. From the largest root of the quartic we find $r_{\min}$ which is then
used in the formula for t (Equation 47) to evaluate the null hypothesis.
Although Equations 46 through 56 may appear formidable, they are
easily and quickly solved by standard computer programs, e.g., IMSL
(1979). Most computer installations will have a program for solving 4th
degree polynomial equations in a single variable, say d. To solve
Equation 51, let $d = \cot v$ and use the program to obtain the largest real
root, say p, of

$$h_4 d^4 + h_3 d^3 + h_2 d^2 + h_1 d + h_0 = 0. \qquad (57)$$

Then the corresponding root v of Equation 51 can be obtained by using
the inverse trigonometric function to solve $v = \operatorname{arccot} p$. It appears
that Kristof's (1973) method represents an efficient procedure for
testing if $r_{XY} = 1.0$ under very liberal assumptions. Although the test
is conservative, it is valid in small sample applications. It has
performed as well or better than other procedures in Monte Carlo
studies. Its efficiency is especially pronounced with small sample sizes.
Thus, Kristof's (1973) procedure possesses some real advantages over
those methods already reviewed. To summarize, the test does not require
the parallelism of $x_1$ and $x_2$ and of $y_3$ and $y_4$ and permits some
between-test error correlations. Large samples are not required to
justify the validity of the test. Widely available standard computer
programs for solving polynomial equations can be used to compute the
necessary test statistic quickly and inexpensively. In addition,
Kristof's procedure conveys a tangible benefit to the user when $x_1$,
$x_2$ and $y_3$, $y_4$ are not parallel. The degree to which Kristof's test
is robust has yet to be determined, however.
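The root-finding step described for the quartic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: NumPy's `numpy.roots` stands in for the IMSL polynomial solver mentioned in the text, the function name and the coefficient values are hypothetical, and the branch of the inverse cotangent (here the angle in the interval from 0 to pi) is an assumption rather than something the source specifies.

```python
import numpy as np


def largest_real_root_and_angle(h4, h3, h2, h1, h0, tol=1e-9):
    """Solve h4*d**4 + h3*d**3 + h2*d**2 + h1*d + h0 = 0 for d = cot(v).

    Returns (p, v): the largest real root p and an angle v with cot(v) = p.
    """
    roots = np.roots([h4, h3, h2, h1, h0])
    # Keep only the roots whose imaginary part is numerically zero.
    real_roots = roots[np.abs(roots.imag) < tol].real
    if real_roots.size == 0:
        raise ValueError("the quartic has no real roots")
    p = float(real_roots.max())
    # arctan2(1, p) is the angle in (0, pi) whose cotangent equals p
    # (choice of branch is an assumption of this sketch).
    v = float(np.arctan2(1.0, p))
    return p, v


# Hypothetical coefficients: (d - 1)(d - 2)(d**2 + 1), real roots 1 and 2.
p, v = largest_real_root_and_angle(1.0, -3.0, 3.0, -3.0, 2.0)
print(p, v)
```

In an application the five coefficients would come from Equations 52 through 56, and the resulting root would be carried forward to obtain the minimized correlation.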

Jöreskog's Test

Jöreskog's maximum likelihood method for estimating the parameters of
and testing hypotheses about covariance structures provides a very
flexible approach for investigating the equivalence of measures. As can
be gleaned from Table 3 the method is based upon the large sample
properties of maximum likelihood estimators and likelihood ratio tests
subject to classical test theory assumptions and the multinormality of
the observation vector $(x_1, x_2, y_3, y_4)$. Parenthetically, we
mention that the general ACOVS or COFAMM models can be defined so as to
relax the classical assumption of uncorrelated errors. However, more
replicate measures on x and y will be required in order to identify the
model. It is obvious from Table 3 that Jöreskog's method compares
favorably with the procedure of Kristof (1973) discussed in the last
section. However, as will be shown below, Jöreskog's technique allows
greater flexibility because it allows one to test a variety of
restrictions and hypotheses.

For the purposes of this review we are interested specifically in
testing two different null hypotheses within the framework of Jöreskog's
ACOVS model. The first is that $r_{XY} = 1.0$, which we now write as $H_E$
(for equivalence). The second null is that for each variable the half
tests meet the assumptions of equality of units of measurement and
standard errors for parallel tests. Thus $H_P$ symbolizes the null
hypothesis that $x_1$ and $x_2$ are parallel and that $y_3$ and $y_4$ are
also parallel. Of course, Jöreskog's method is completely general so
that an interested investigator could test assumptions of tau
equivalence, or of the parallelism of four observed variables, i.e.,
Wilks' (1946) test, etc. Although in this section equality constraints on
the means or origins of the measures will not be considered, the reader
should appreciate that Jöreskog's general COFAMM model readily permits
tests about the structuring of the means (as was illustrated in a
previous section). There is typically little interest in differences in
means between tests, so this issue will not be pursued here. After
considering the computational formulas, we shall describe how two
alternative tests of $H_E$ and $H_P$ may be formulated and evaluated.

Jöreskog's (1970) general method for analyzing covariance structures
assumes that the population covariance matrix $\mathbf{S}_{xy}$ has the form given
in Equation (21), which is reproduced here for the reader's convenience:

$$\mathbf{S}_{xy} = \mathbf{B}\,\mathbf{S}_{XY}\,\mathbf{B}' + \mathbf{S}_{ee}. \qquad (21)$$

A covariance matrix of this structure is produced when the observed
variables are structured as Equation 25 (reproduced here):

$$\mathbf{x} = \mathbf{m} + \mathbf{B}\mathbf{t} + \mathbf{e}. \qquad (25)$$

Three kinds of parameters may be contained in the parameter matrices $\mathbf{B}$, $\mathbf{S}_{XY}$
and $\mathbf{S}_{ee}$: (i) fixed parameters that are assigned a priori
values, (ii) constrained parameters that are unknown but equal to one or
more other parameters, and (iii) free parameters that are unknown and
unconstrained.

The problem is to find estimates of the constrained and free
parameters which maximize the likelihood of the sample values given a
model of the form of Equation 21. For most applications simple analytic
solutions do not exist, so Jöreskog (1970) uses the numerical method of
Davidon (1959) and Fletcher and Powell (1963) to maximize the likelihood
function. Jöreskog argues that compared with variants of the
Newton-Raphson technique, this is an efficient procedure which makes use
of the derivatives of the likelihood function and the inverse of the
information matrix. Actually, Jöreskog (1970) finds it more convenient
to minimize a function F, which is equivalent to maximizing the logarithm
of the likelihood function L:

$$F = \log|\hat{\mathbf{S}}_{xy}| + \operatorname{tr}(\mathbf{S}_{xy}\hat{\mathbf{S}}_{xy}^{-1}) - \log|\mathbf{S}_{xy}| - J, \qquad (58)$$

where $\hat{\mathbf{S}}_{xy}$ contains the maximum likelihood estimates of $\mathbf{S}_{xy}$ estimated under
the model specified by Equation 21 and J is the number of observed
variables. F is a function of the independent elements in $\mathbf{B}$, $\mathbf{S}_{XY}$, and $\mathbf{S}_{ee}$.
In large samples, N-1 times the minimum value of F is distributed as chi
square and may be used to test the goodness of fit of the model. In
addition, approximate standard errors may be obtained for each estimated
parameter from the inverse of the information matrix computed at the
minimum of F.
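A minimal numerical sketch of the fit function may make the goodness-of-fit statistic more concrete. It assumes that Equation 58 has the form given by Jöreskog (1970), that is, the log determinant of the model-implied matrix, plus the trace of the sample matrix times the inverse of the model-implied matrix, minus the log determinant of the sample matrix, minus the number of observed variables J. The function names and the small matrices below are hypothetical; the sketch evaluates the function for given matrices and does not perform the Davidon-Fletcher-Powell minimization itself.

```python
import numpy as np


def fit_function(sample_cov, model_cov):
    """Maximum likelihood discrepancy between a sample covariance matrix
    and a model-implied covariance matrix (assumed form of Equation 58)."""
    S = np.asarray(sample_cov, dtype=float)
    Sigma = np.asarray(model_cov, dtype=float)
    J = S.shape[0]
    sign_model, logdet_model = np.linalg.slogdet(Sigma)
    sign_sample, logdet_sample = np.linalg.slogdet(S)
    if sign_model <= 0 or sign_sample <= 0:
        raise ValueError("both matrices must have positive determinants")
    # tr(S Sigma^{-1}) equals tr(Sigma^{-1} S); solve avoids an explicit inverse.
    trace_term = np.trace(np.linalg.solve(Sigma, S))
    return logdet_model + trace_term - logdet_sample - J


def chi_square_from_fit(f_min, n):
    """Large-sample goodness-of-fit statistic, (N - 1) times the minimum of F."""
    return (n - 1) * f_min


# Hypothetical 2 x 2 example: the fit is perfect when the two matrices agree.
S = np.array([[1.0, 0.6], [0.6, 1.0]])
print(fit_function(S, S), chi_square_from_fit(fit_function(S, 2 * S), 101))
```

The function is zero when the model reproduces the sample matrix exactly and grows as the two matrices diverge.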

Hypotheses are tested in this approach by the likelihood ratio
technique. The ACOVS⁴ or COFAMM programs compute a chi square value
for each specified model against the most general alternative that
$\mathbf{S}_{xy}$ is any positive definite matrix:

$$\chi^2 = -2\ln\frac{L(\mathbf{S}_R)}{L(\mathbf{S}_F)}, \qquad (59)$$

where $L(\mathbf{S}_R)$ represents the likelihood under a given specification of
fixed, free and constrained parameters (Restricted model), and $L(\mathbf{S}_F)$
is the likelihood under the assumption that $\mathbf{S}_{xy}$ is any positive
definite matrix (Full model). According to Jöreskog (1970), it is
possible to test any given model, say $M_i$, against a more general
alternative, say $M_j$, by estimating and testing each one separately
(against the most general alternative that $\mathbf{S}_{xy}$ is any p.d. matrix) and
comparing their goodness of fit values. The difference in chi
square values is asymptotically chi square distributed with degrees of
freedom equal to the corresponding difference in degrees of freedom
between the two models:

$$\chi^2_{ij} = \chi^2_i - \chi^2_j = (N-1)(F_i - F_j), \qquad (60)$$

with

$$df_{ij} = df_i - df_j. \qquad (61)$$

In general, the number of degrees of freedom on which any chi square test
is based equals the difference in the number of parameters estimated
under the full and restricted models.
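The comparison of two nested models can be sketched as simple bookkeeping on their separate goodness-of-fit results. The Python fragment below is illustrative only: the function name is hypothetical, the chi square values and degrees of freedom are invented for the example, and the resulting difference would still have to be referred to a chi square table (or a statistical library) for the stated degrees of freedom.

```python
def chi_square_difference(chi2_restricted, df_restricted, chi2_general, df_general):
    """Difference test for a restricted model against a more general one.

    Each model is assumed to have been fitted separately against the
    unrestricted alternative, as described in the text.
    Returns (difference in chi square, difference in degrees of freedom).
    """
    df_diff = df_restricted - df_general
    if df_diff <= 0:
        raise ValueError("the restricted model must have more degrees of freedom")
    chi2_diff = chi2_restricted - chi2_general
    return chi2_diff, df_diff


# Hypothetical fit results for a restricted and a more general model.
difference, df = chi_square_difference(12.4, 5, 3.1, 2)
print(difference, df)
```

A large difference relative to its degrees of freedom counts against the additional restrictions imposed by the more restricted model.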

With this introduction to Jöreskog's method we can now explicate the
hypotheses (models) of interest in investigations seeking to determine
the equivalence of measures. Following Jöreskog (1971) we suggest four
models which could be tested:



2 2^2 2 ^ 0 






* " 2 '2 a „ 2 



$M_3$: $r_{XY} = 1.0$

$M_4$: S is any p.d. matrix of rank 2 with the elements of $\mathbf{B}$, $\mathbf{S}_{XY}$, and $\mathbf{S}_{ee}$ all free.

Each of these four models is tested against the most general model:

$M_5$: $\mathbf{S}_{xy}$ is any p.d. matrix.



This series of tests is illustrated in the upper portion of Table 4 where
the numbers of parameters and degrees of freedom are indicated. To test
the hypothesis that the two tests x and y are equivalent ($H_E$: $r_{XY} = 1.0$)
we could consider the goodness of fit of either Models 1 or 3. $H_P$, the
null hypothesis that $x_1$, $x_2$ and $y_3$, $y_4$ are parallel, could be tested by
Model 2. However, Jöreskog (1974) maintains that "the value of $\chi^2$ should
be interpreted very cautiously." He suggests that it is more informative
to test the reasonableness of any restriction "by fitting two different
models, one of which contains the restriction, the other of which does
not. The differences between $\chi^2$ values matter rather than the $\chi^2$
values themselves" (Jöreskog, 1974).

Insert Table 4 about here

In the lower portion of Table 4 are presented four model comparisons
which yield tests of $H_E$ and $H_P$. The differences in chi square values
for the indicated models yield tests of the following null hypotheses:

$M_1$ v. $M_2$: Given $x_1$, $x_2$ and $y_3$, $y_4$ are parallel, test if x and y are
congeneric.

$M_3$ v. $M_4$: Given $x_1$, $x_2$ and $y_3$, $y_4$ are congeneric, test if x and y are
congeneric.

$M_1$ v. $M_3$: Given x and y are congeneric, test if $x_1$, $x_2$ and $y_3$, $y_4$ are
parallel.

$M_2$ v. $M_4$: Given x and y are not congeneric, test if $x_1$, $x_2$ and $y_3$, $y_4$ are
parallel.

(Of course other possibilities exist, e.g., tau equivalence, and these can
be tested easily by the ACOVS or COFAMM programs.)

We observe that the test of the $M_1$ v. $M_2$ comparison is identical
to Lord's (1957) test. They differ only in computing algorithms. The
comparison of Models 3 and 4 is comparable to Kristof's (1973) Case iii
which was presented in the prior section. Although the underlying
assumptions and null and alternative hypotheses are (roughly) the same,
Kristof's and Jöreskog's test statistics differ considerably. In
simulation experiments in which they have been compared, Kristof's method
has been generally more efficient in small samples. When N exceeds 200,
however, Jöreskog's procedure demonstrates greater power. The
availability and ease of use of Jöreskog's COFAMM program are certainly
advantages of this technique. However, serious questions about the
method's sensitivity to departures from normality remain. Recently,
problems have been found with the Davidon-Fletcher-Powell algorithm which
COFAMM uses (Lee & Jennrich, 1979). Thus, it seems premature at this
point to recommend ACOVS/COFAMM as the optimal large-scale procedure.

Before closing this discussion, it is worth noting that Jöreskog's
method affords the capability of testing which test model is appropriate
for $x_1$ and $x_2$ and for $y_3$ and $y_4$. Identical tests of parallelism
can be constructed using the methods of Wilks (1946), Votaw (1948), and
Jöreskog (1970, 1971). These all produce likelihood ratio tests, but
they differ in computational methods. The ease and flexibility of
Jöreskog's procedure would seem to recommend it for testing assumptions
about test score models, e.g., whether a set of measurements conforms to
the assumptions of congeneric, essentially tau equivalent, or parallel
tests.

In concluding this description of the ACOVS method we point out that
it will yield a ML estimate of $r_{XY}$ when $H_E$ is not tenable and that it
easily accommodates the analysis of several sets of congeneric tests
(e.g., x, y, z) each of which has several replications (e.g., $x_1$, $x_2$,
$x_3$, $y_4$, $y_5$, $y_6$, $z_7$, $z_8$, $z_9$). The null hypothesis of interest in
this situation is $H_0$: $r_{XY} = r_{XZ} = r_{YZ} = 1.0$.
It is interesting to note that ACOVS or COFAMM can be used to test the
hypothesis that the correlation corrected for attenuation equals 1.0 in situations
where replicate measurements on x and y are not available if the
reliabilities or standard errors of x and y are known (cf. Equation 3).
This will be considered in the next section.

Procedures Using Error Variances 
Koopmans-Tintner Method

The credit for developing the first statistical procedure for testing
the hypothesis that true scores have a perfect linear relation by using
information about the covariance structure of the errors of measurement
must be shared by many statisticians. I attribute the method primarily
to Koopmans (1937) and Tintner (1945, 1946, 1950) because of their
concern for significance testing and research application. Building
primarily on the work of Rhodes (1927) and van Uven (1930), Koopmans
(1937) proved the maximum likelihood properties of van Uven's (1930)
weighted regression estimates of the parameters of a linear functional
relation and derived approximate sampling distributions for the
coefficients. This work was extended by Tintner (1945, 1946, 1950), who
used a result of Hsu (1941) to derive an approximate asymptotic test of
equivalence. He applied this method, which in the field of econometrics
is now commonly referred to as the method of weighted regression, to
problems of multicollinearity and homogeneous economic functions. As
will be seen, this method shares certain identities with several
multivariate techniques, most notably factor analysis and canonical
correlation. Although none of its developers were concerned with the
problem of equivalence of measures as defined in this chapter, the
weighted regression method permits a test of the hypothesis that two or
more scales differ only in means, units of measurement, and standard
errors of measurement.

To explicate the procedure, we first form the covariance matrix of
the total scores x and y, $\mathbf{S}_{xy}$, from the entries shown in Equation 33:

$$\mathbf{S}_{xy} = \begin{bmatrix} s_{xx} & s_{xy} \\ s_{yx} & s_{yy} \end{bmatrix}. \qquad (62)$$

The covariance structure of the measurement errors for the total test
scores given in Equations 1 and 2 is defined as

$$\mathbf{S}_{ee} = \begin{bmatrix} s_{e_x e_x} & s_{e_x e_y} \\ s_{e_y e_x} & s_{e_y e_y} \end{bmatrix}, \qquad (63)$$

which explicitly permits correlated errors. For ease of presentation the
method is illustrated for the case where there are only two (total)
scales. The matrix formulation is completely general, however, and holds
for any number of tests. Assuming that an estimate of $\mathbf{S}_{xy}$ is available
and that $\mathbf{S}_{ee}$ is known, Equation 7 may be estimated by solving the
two-matrix eigenproblem (cf. Bock, 1975):

The elements of the eigenvector £ - (82^/82^ corresponding to the ' 

A r 

smallest root^.u. are LS/ML estimators of 

- c + g^-X . ; .(7) 

The intercept is computed by inserting mean values for X and Y in
Equation 7 and solving for c:

^ "'^l^"'i2^ ^ , . • (65) 

Under the null hypothesis (of equivalence) the quantity $(N-1)u_1$ is
approximately distributed as chi square with N-2 degrees of freedom.

Anderson (1948) pJcoved that the quantity^, (G^ "^N)/2n followed the unit 

normal distribution for large N. Approximations using the F distribution
have been proposed by various authors as well. When the value of either
of these test statistics exceeds the tabled value for the prespecified
alpha level and appropriate degrees of freedom, the hypothesis that Y and
X are equivalent is rejected.
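The weighted regression computation can be sketched numerically as follows. This is an illustration under explicit assumptions: the two-matrix eigenproblem is taken to have the usual generalized form, in which the covariance matrix of the observed scores minus u times the error covariance matrix annihilates the eigenvector g; the error covariance matrix is assumed positive definite so that a Cholesky factor exists; and the function names and example matrices are hypothetical.

```python
import numpy as np


def smallest_root(S_xy, S_ee):
    """Smallest root u1 and eigenvector g of (S_xy - u * S_ee) g = 0.

    S_xy: covariance matrix of the observed total scores.
    S_ee: covariance matrix of the measurement errors (assumed p.d.).
    """
    S_xy = np.asarray(S_xy, dtype=float)
    S_ee = np.asarray(S_ee, dtype=float)
    L = np.linalg.cholesky(S_ee)            # S_ee = L L'
    L_inv = np.linalg.inv(L)
    A = L_inv @ S_xy @ L_inv.T              # equivalent symmetric problem
    values, vectors = np.linalg.eigh(A)     # eigenvalues in ascending order
    u1 = float(values[0])
    g = L_inv.T @ vectors[:, 0]             # back-transform the eigenvector
    return u1, g


def chi_square_statistic(u1, n):
    """The quantity (N - 1) * u1 referred to in the text."""
    return (n - 1) * u1


# Hypothetical matrices, for illustration only.
S_xy = np.array([[2.0, 1.0], [1.0, 2.0]])
S_ee = np.eye(2)
u1, g = smallest_root(S_xy, S_ee)
print(u1, g, chi_square_statistic(u1, 51))
```

The Cholesky reduction is one convenient way to turn the two-matrix problem into an ordinary symmetric eigenproblem; any generalized eigenvalue routine would serve equally well.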



The weighted regression method assumes that the population value of $\mathbf{S}_{ee}$
is used in the preceding calculations. Koopmans and Tintner both
argue that using an estimate of $\mathbf{S}_{ee}$ will not greatly affect the
validity or accuracy of the structural coefficient estimates as long as
the variances of the true scores are much greater than the error
dispersions, i.e., that the measures have high reliabilities. Malinvaud
(1970) concurs, but cautions that this and other deductions apply only to
the asymptotic distribution of the weighted regression. And, as he
points out, "[u]nfortunately there seems to exist no study of the
properties of this regression for finite samples" (p. 394). It is not
known how efficient and robust this method is relative to those of
Kristof, Lord, Jöreskog or others. The flexibility, generality, and ease
of calculation make this technique potentially attractive. However, much
more needs to be known about its small sample behavior in comparison to
the other methods.

Fuller's Test

In a significant contribution to the weighted regression method,
Fuller (1980; Warren, White & Fuller, 1974) derived a significance test
for the smallest root of the determinantal equation (64) that is valid
for small samples and modified the equation to improve the efficiency of
the estimators of the functional relation coefficients. The methods
devised by Fuller may be used with any number of variables; but, again
for illustrative purposes, we shall consider only two scales, x and y, as
in the preceding section. Fuller assumes that the vectors containing the
errors of measurement are independently and identically distributed as a
multivariate normal random variable with mean zero and covariance matrix
$\mathbf{S}_{ee}$. The matrix $\hat{\mathbf{S}}_{xy}$ is positive definite and a consistent
estimator of $\mathbf{S}_{xy}$. Finally, an unbiased estimator, $\hat{\mathbf{S}}_{ee}$, of a
multiple of $\mathbf{S}_{ee}$ is available. Fuller (1980) presents formulas for the
case where $\mathbf{S}_{ee}$ is known to be diagonal (the measurement errors are
uncorrelated) and for the general case when $\mathbf{S}_{ee}$ is a positive
semidefinite matrix. As described in Warren, White, and Fuller (1974)
the null hypothesis given by Equation (8), that the variance of the
stochastic error in the equation equals zero, can be tested for the
special case of uncorrelated errors as follows:

Under the stated conditions and the null hypothesis that this variance equals 0,
the distribution of the smallest root of

^ — XX 1 — ee 

r . 

can be approximated by Snedecor's F



(66) 



(67) 



with N-2 and d degrees of freedom where 



d - 



(N-1) g2 

X • 



(68) 



When the obtained F exceeds the critical value of F for N-2 and d degrees
of freedom, the hypothesis of equivalence or perfect linear relation is
rejected.

The consistency and small-sample properties of Fuller's estimators
also hold for those of Koopmans and Tintner under the same assumptions.
Fuller's test statistic should be better behaved than Tintner's
approximations. However, the performance of these tests when the
assumptions of normality and linearity are violated is unknown. The
power of the weighted regression methods relative to that of Kristof's
and Jöreskog's tests has not been determined either.

Jöreskog's Model

The general form of Jöreskog's method has been discussed extensively
in preceding sections. Thus, it will be considered only briefly here.
In Jöreskog's (1973) LISREL (Linear Structural Relations) formulation it
is possible to test the hypothesis that the variance of the stochastic
error equals 0 in terms of the
difference between two model chi squares. The path model is illustrated
in Figure 3. This test has all of the characteristics described for the
ACOVS test above and likewise depends upon assumptions of multivariate
normality and large sample size (Jöreskog & Sörbom, 1978). This LISREL
test would be expected to perform very similarly to the ACOVS/COFAMM
test. This is the final method which is based on information about the
measurement error covariance structure to be considered. Now the only
method to be considered which uses independent information about the
reliabilities of x and y will be reviewed.

Insert Figure 3 about here


Procedures Using Reliabilities 

Fuller-Hidiroglou Test

Fuller and Hidiroglou (1978) developed a model which uses information
about the reliabilities of x and y to estimate the measurement error
variances. Once the error variance estimates are obtained, the
hypothesis testing procedure closely parallels Fuller's (1980) method
that was presented in an earlier section. However, more stringent
assumptions are required in the present case. In addition to assumptions
about the normality of the error distributions, Fuller and Hidiroglou
(1978) assume that the true scores, X and Y, are normal independent
random variables with mean zero. Although this assumption will not be
tenable in some educational applications, it is not an unreasonable
assumption for much of the research on educational change.



First, we define $k_{yy}$ and $k_{xx}$ as the ratio of error variance to
total variance:

$$k_{yy} = \frac{s_{e_y e_y}}{s_{yy}} \qquad (69)$$

and

$$k_{xx} = \frac{s_{e_x e_x}}{s_{xx}}. \qquad (70)$$

Then, the reliabilities of the observed variables may be written as

$$r_{yy} = 1.0 - k_{yy}, \text{ and} \qquad (71)$$

$$r_{xx} = 1.0 - k_{xx}. \qquad (72)$$


Given independent estimates of the reliabilities, Equations 71 and 72 can
be used to estimate $k_{yy}$ and $k_{xx}$. Define $\mathbf{K}$ as a matrix of order two
with $k_{yy}$ and $k_{xx}$ on the diagonal. Let $\mathbf{D}$ be a diagonal matrix with
the standard deviations of x and y on the principal diagonal. The
smallest root of the determinantal equation

$$|\mathbf{S}_{xy} - u\,\mathbf{D}\mathbf{K}\mathbf{D}| = 0 \qquad (75)$$

may be used to test the hypothesis of equivalence. If $u_1$ is not
significantly different from one, the hypothesis of equivalence is


accepted. Since the limiting distribution of u^ is the unit normal, 
the quantity <^""''" N*^ (u^^ - X) may be compared xrith the tabular 
values of the unit normal distribution to test the hypothesis (Fuller & 
Hldiroglou, 1978, p. 104). 
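The computation of the smallest root from known reliabilities can be sketched as follows. The sketch assumes that the determinantal equation has the form "determinant of the observed covariance matrix minus u times DKD equals zero", that the variables are ordered x then y in every matrix (so K carries the error ratio for x first), and that both reliabilities are below 1 so that DKD is nonsingular. The function name and the example values are hypothetical.

```python
import numpy as np


def reliability_root(S_xy, r_xx, r_yy):
    """Smallest root u1 of |S_xy - u * D K D| = 0.

    S_xy: 2 x 2 covariance matrix of observed x and y (x first, y second).
    r_xx, r_yy: independently estimated reliabilities of x and y.
    """
    S = np.asarray(S_xy, dtype=float)
    K = np.diag([1.0 - r_xx, 1.0 - r_yy])   # ratios of error to total variance
    D = np.diag(np.sqrt(np.diag(S)))        # standard deviations of x and y
    E = D @ K @ D                           # implied error covariance matrix
    roots = np.linalg.eigvals(np.linalg.solve(E, S))
    return float(np.min(roots.real))


# Hypothetical example: observed correlation .8, both reliabilities .8, so the
# correlation corrected for attenuation is exactly 1.0.
S_xy = np.array([[1.0, 0.8], [0.8, 1.0]])
print(reliability_root(S_xy, r_xx=0.8, r_yy=0.8))
```

In this constructed case the smallest root equals one, which is the value the text associates with equivalence; values well above one would count against it.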

This concludes the review of methods for determining whether a linear
functional relation exists or that measures are equivalent. It is easy
to see how they could be used to good advantage in many studies of
educational change. The optimal method will be a function of the kinds
of data available and the properties of the tests and estimates. The
results obtained in this chapter ought to assist researchers in choosing
the best statistical LFR method for their needs.



REFERENCES FOR CHAPTER IV

Adcock, R.J. A problem in least squares. Analyst, 1878, 5, 53-54.

Aigner, D., & Goldberger, A.S. (Eds.) Latent variables in socio-economic
models. Amsterdam: North-Holland, 1976.

American Psychological Association. Manual for educational and psychological
tests. Washington, D.C.: Author, 1966.

Anderson, T.W. The asymptotic distribution of the roots of certain
determinantal equations. Journal of the Royal Statistical Society,
Series B, Supplement No. 1, 1948, 10, 190 ff.

Bejar, I.I. Biased assessment of program impact due to psychometric
artifacts. Psychological Bulletin, 1980, 87, 513-524.

Blalock, H.M. Multiple indicators and the causal approach to measurement
error. American Journal of Sociology, 1969, 75, 264-272.

Block, J. The equivalence of measures and the correction for attenuation.
Psychological Bulletin, 1963, 60, 152-156.

Bock, R.D. Multivariate statistical methods in behavioral research. New
York: McGraw-Hill, 1975.

Campbell, D.T., & Fiske, D.W. Convergent and discriminant validation by
the multitrait-multimethod matrix. Psychological Bulletin, 1959, 56,
81-105.

Cronbach, L.J. Test validation. In R.L. Thorndike (Ed.), Educational
measurement (2nd ed.). Washington, D.C.: American Council on
Education, 1971.

Cronbach, L.J., & Meehl, P.E. Construct validity in psychological
tests. Psychological Bulletin, 1955, 52, 456-473.

Cureton, E.E. On certain estimated correlation functions and their
standard errors. Journal of Experimental Education, 1936, 4,
252-264.

Davidon, W.C. Variable metric method for minimization. A.E.C. Research
and Development Report, ANL-5990 (Rev.), 1959.

Dunivant, N. Review of procedures for determining the equivalence of
measures. Paper presented at meeting of the American Educational
Research Association, San Francisco, April 1979.

Dunivant, N. A bibliography on statistical methods for determining the
equivalence of tests and assessment procedures. Journal Supplement
and Abstract Service, 1979.



Fletcher, R., & Powell, M.J.D. A rapidly converging descent method for
minimization. The Computer Journal, 1963, 6, 163-168.

Forsyth, R.A., & Feldt, L.S. An investigation of empirical sampling
distributions of correlation coefficients corrected for attenuation.
Educational and Psychological Measurement, 1969, 29, 61-72.

Forsyth, R.A., & Feldt, L.S. Some theoretical and empirical results
related to McNemar's test that the population correlation coefficient
corrected for attenuation equals 1.0. American Educational Research
Journal, 1970, 7, 197-207.

Fuller, W.A. Properties of some estimators for the errors-in-variables
model. Annals of Statistics and Probability, 1980, in press.

Goldberger, A.S., & Duncan, O.D. Structural equation models in the social
sciences. New York: Seminar Press, 1973.

Gulliksen, H. Theory of mental tests. New York: Wiley, 1950.

Gulliksen, H. Methods for determining equivalence of measures.
Psychological Bulletin, 1968, 70, 534-544.

Hanushek, E.A., & Jackson, J.E. Statistical methods for social
scientists. New York: Academic Press, 1977.

Hsu, P.L. On the problem of rank and the limiting distribution of
Fisher's test function. Annals of Eugenics, 1941, 11, 39 ff.

Heise, D.R. Separating reliability and stability in test-retest
correlation. American Sociological Review, 1969, 34, 93-101.

Hsu, T.C., & Sebatane, E.M. The effect of differences in covariate means
among treatment groups upon the F-test of the analysis of
covariance. Paper presented at the Annual Meeting of the American
Educational Research Association, New York, N.Y., April, 1977.

IMSL, Inc. IMSL library reference manual. Vols. 1-3. Houston: Author,
1979.

Isaac, P.D. Linear regression, structural relations, and measurement
error. Psychological Bulletin, 1970, 74, 213-218.

Johnson, N.L. Systems of frequency curves generated by methods of
translation. Biometrika, 1949, 36, 149-176.

Johnston, J. Econometric methods. New York: McGraw-Hill, 1963.

Jöreskog, K.G. A general method for estimating a linear structural
equation system. In A.S. Goldberger & O.D. Duncan (Eds.), Structural
equation models in the social sciences. New York: Seminar Press,
1973.

Jöreskog, K.G. A general method for analysis of covariance structures.
Biometrika, 1970, 57, 239-251.

Jöreskog, K.G., & Sörbom, D. LISREL IV - A general computer program for
estimation of a linear structural equation system by maximum
likelihood methods. Chicago: National Educational Resources, 1978.

Jöreskog, K.G. Statistical analysis of sets of congeneric tests.
Psychometrika, 1971, 36, 109-133.

Jöreskog, K.G. Analyzing psychological data by structural analysis of
covariance matrices. In D.H. Krantz, R.C. Atkinson, R.D. Luce, & P.
Suppes (Eds.), Contemporary developments in mathematical psychology
(Vol. II). San Francisco: Freeman, 1974.

Jöreskog, K.G., & Goldberger, A.S. Estimation of a model with multiple
indicators and multiple causes of a single latent variable. Journal
of the American Statistical Association, 1975, 70, 631-633.

Jöreskog, K.G., Gruvaeus, G.T., & van Thillo, M. ACOVS: A general computer
program for analysis of covariance structures (RB 70-15). Princeton,
N.J.: Educational Testing Service, 1970.

Kelley, T.L. Fundamentals of statistics. Cambridge: Harvard University
Press, 1947.

Kendall, M.G., & Stuart, A. The advanced theory of statistics. Vol. 2:
Inference and relationship. London: Charles Griffin, 1961.

Koopmans, T.C. Linear regression analysis in economic time series.
Haarlem: De Erven F. Bohn N.V., 1937.

Kristof, W. Testing a linear relation between true scores of two
measures. Psychometrika, 1973, 38, 101-111.

Lee, S-Y., & Jennrich, R.I. A study of algorithms for covariance
structure analysis with specific comparisons using factor analysis.
Psychometrika, 1979, 44, 99-113.

Lord, F.M. A significance test for the hypothesis that two variables
measure the same trait except for errors of measurement.
Psychometrika, 1957, 22, 207-220.

Lord, F.M. Testing if two measuring procedures measure the same
dimension. Psychological Bulletin, 1973, 79, 71-72.

Lord, F.M., & Novick, M.R. Statistical theories of mental test scores
(with contributions by Allan Birnbaum). Reading, Mass.:
Addison-Wesley, 1968.

Madansky, A. The fitting of straight lines when both variables are
subject to error. Journal of the American Statistical Association,
1959, 54, 173-205.



Malinvaud, E. Statistical methods of econometrics (2nd rev. ed.). New
York: American Elsevier, 1970. (Tr. by Mrs. A. Silvey.)

McNemar, Q. Attenuation and interaction. Psychometrika, 1958, 23,
259-265.

Mendenhall, W., & Scheaffer, R.L. Mathematical statistics with
applications. North Scituate, Mass.: Duxbury Press, 1973.

Moran, P.A.P. Estimating structural and functional relationships.
Journal of Multivariate Analysis, 1971, 1, 232-255.

Nesselroade, J.R., & Reese, H.W. Life-span developmental psychology:
Methodological issues. New York: Academic Press, 1973.

Nunnally, J.C. Psychometric theory (2nd ed.). New York: McGraw-Hill,

Rhodes, E.C. On lines and planes of closest fit. Philosophical
Magazine, Series 7, 1927, 2, 357 ff.

Spearman, C. The proof and measurement of association between two things.
American Journal of Psychology, 1904, 15, 72-101. (a)

Spearman, C. General intelligence, objectively determined and measured.
American Journal of Psychology, 1904, 15, 201-293. (b)

Spearman, C. Demonstration of formulae for true measurement of
correlation. American Journal of Psychology, 1907, 18, 161-169.

Spearman, C. Coefficient of correlation calculated from faulty data.
British Journal of Psychology, 1910, 3, 271-295.

Tintner, G. A note on rank, multicollinearity and multiple regression.
Annals of Mathematical Statistics, 1945, 16, 304 ff.

Tintner, G. Multiple regression for systems of equations. Econometrica,
1946, 14, 5 ff.

Tintner, G. A test for linear relations between weighted regression
coefficients. Journal of the Royal Statistical Society, Series B,
1950, 12, 273 ff.

Villegas, C. Confidence region for a linear relation. Annals of
Mathematical Statistics, 1964, 35, 780-788.

Votaw, D.F., Jr. Testing compound symmetry in a normal multivariate
distribution. Annals of Mathematical Statistics, 1948, 19, 447-473.

van Uven, M.J. Adjustment of N points (in n-dimensional space) to the
best linear (n-1)-dimensional space: I and II. Koninklijke Akademie
van Wetenschappen te Amsterdam, Proceedings of the Section of Science,
1930, 23, 143 ff., 307 ff.



Warren, R.D., White, J.K. , &*^Fyller, W.A»' An errors-ln-varlaTjles analysis 
of managerial role performance* Journal of the American Statistical 
Association, 1974^^9, 886-^893. 

Werts, C«E., & Linn, R«L. Corrections for attenuation. Educational 
: Psychological Measurement , 19f72, 32^, 117-127. 

Werts, C.E., Linn, R.L«,'& Jbreskog, K»G. Another pe^r spec tlve on "Linear 
regression, ^structural relations, and measurement error.*' 
Sducatlonal agd Psychological Measurement , 1973, 313, 327-332. 

I 

Wllks,^ S.S. Sample cTlterla fot testing equality of means, equality of 
. variance and equality of covarlances in ^ normal multivariate 
distribution. Annals of Mathematical Statistics , 1946, 17, 257-281. 

l^onnacdtt, T.H., & Wonnacott, R.J. Introductory statistics (2nd ed.). 
New York! Wiley, 1972. 







FOOTNOTES

1. In this chapter a prime indicates vector and matrix transposition.

2. The determinant of a 2x2 matrix is equal to the product of the diagonal elements minus the product of the off-diagonal elements.

3. In this quotation and all others cited, symbols have been changed to conform to the notational conventions used in this paper.

4. The most recent version of Jöreskog's program for the analysis of covariance structures, Confirmatory Factor Analysis with Model Modification (COFAMM), is marketed through International Educational Resources, Inc., Box A3650, Chicago, Illinois 60690.






Table 1

Comparison of Test Score Models (a)

                              Propensity Distributions                        Linear
                              First Two    Higher     Experimental    Experimental    True       Error      Observed   Intercor-
Test Score Model              Moments      Moments    Independence    Independence    Scores     Variance   Means      relations   Validities

Strictly Equivalent           Equal        Equal      Yes             Yes             Equal      Equal      Equal      Equal       Equal
Parallel                      Equal        Unequal    No              Yes             Equal      Equal      Equal      Equal       Equal
τ-equivalent                  Unequal      Unequal    No              Yes             Equal      Unequal    Equal      Unequal     Unequal
Essentially τ-equivalent (b)  Unequal      Unequal    No              Yes             Unequal    Unequal    Unequal    Unequal     Unequal
Congeneric (c)                Unequal      Unequal    No              Yes             Unequal    Unequal    Unequal    Unequal     Unequal

(a) See Lord and Novick (1968, Ch. 2) for more information.
(b) True scores may differ only by an additive constant.
(c) True scores may differ only by an additive constant and a scaling factor.









; ■ 





Table 2

Score Schema (a)

                      Test x                          Test y
                Replication                     Replication
Individual      (1)       (2)       Sum         (3)       (4)       Sum

1               x11       x12       x1.         y13       y14       y1.
2               x21       x22       x2.         y23       y24       y2.
.               .         .         .           .         .         .
i               xi1       xi2       xi.         yi3       yi4       yi.
.               .         .         .           .         .         .
N               xN1       xN2       xN.         yN3       yN4       yN.

Mean            x̄.1       x̄.2       x̄..         ȳ.3       ȳ.4       ȳ..

(a) Adapted from McNemar (1958, p. 259).





























Table 3

Comparison of Assumptions and Hypotheses of Eight Methods for Determining Equivalence

Assumption

Large sample test
$x_1$ and $x_2$ same origins
$x_1$ and $x_2$ same units of measurement
$x_1$ and $x_2$ same std error of measurement
$y_3$ and $y_4$ same origins
$y_3$ and $y_4$ same units of measurement
$y_3$ and $y_4$ same std error of measurement
$x_1$, $x_2$, $y_3$ and $y_4$ same origins
$x_1$, $x_2$, $y_3$ and $y_4$ same units of measurement
$x_1$, $x_2$, $y_3$ and $y_4$ same std error of measurement
x and y same reliabilities
r = 1.0
$E(e_1e_2) = 0$, $E(e_3e_4) = 0$
$E(e_iX) = E(e_iY) = 0$
$E(e_1e_3) = E(e_1e_4) = E(e_2e_3) = E(e_2e_4) = 0$
$x_1$, $x_2$, $y_3$ and $y_4$ multivariate normal
$e_1$, $e_2$, $e_3$ and $e_4$ normally distributed
$e_1e_3$, $e_1e_4$, $e_2e_3$ and $e_2e_4$ bivariate normal
$e_1e_3$ and $e_1e_4$ same joint distribution
$e_2e_3$ and $e_2e_4$ same joint distribution
Easily generalized to three or more tests (x, y, z, ...)





4 


Test* 










ildt- 


ildt 




» 
« 

60 
«l 






u 


1 McNemar 
Forsyth- Ff 
McNemar 


h 




ft 




Jöreskog

f 


0 

> 

1 

60 


Forsyth- 


Lord 


Lord-Vi] 


Kristof


,/ 

/ 

V 

/ 


■ N N 


Y 






N 


Y 




Y Y 


y' 


Y 


Y 


N 


N/H ' 


H 


Y Y, 


Y 


Y 


Y 


N . 


N/H 


. H 


Y Y 


Y 


Y 


Y 


N ' 


N/H' 


H . 


Y Y ^ 


Y 


Y 


Y 


N 


NAH 


H 


Y Y 

1 


Y 


Y 


' Y 


N 


N/H 


H • 


1 

Y Y 


Y 


Y 


Y 


N 


N/H 


H 


Y Y 


N 


N 


N 


N 


N/H 


H 


Y Y 


N 


N 


N 


N 


N/H 


H 


Y N 


N 


N 


V 

N- 


N 


N/H 


n 


Y N 


N 


N 


' N 


" N 


K/H 


H 


' H H 


•H 


H 


H 


H 


H 


Y 


Y Y 


Y 


Y 


V 

I 


Y 


Y/H 


Y 


Y Y 


Y 


Y 


Y 


Y 


Y/H 


Y 


Y Y 


Y 


,Y 


N 


N 


Y/H 


Y 


Y Y 


Y 


Y 


X 


N 


Y* 


Y 


IT N 


N 


Y 


N 


N 


Y 


N 


Y , Y 




N 


Y 


Y 


- N 


N 


N N 


N 


N 


Y 


Y 


N 


N 


- N 


N 


N 


V 


• N 


N 


N 


N N 


N 


N 


Y 


H 
N 


N 


Y 


Y N/ 






N 


Y 


N 


N ' N 


Y 


'y 


\n 


N 


N 



x and y bivariate normal distribution

*The letters make the following designations: Y - Yes, the assumption is required; N - No, the assumption is not required; H - Hypothesis, the assumption is tested as the null hypothesis.






Restricted Model 



M2 



'4, 



Table 4

Tests of Equivalence Using ACOVS



.Number of 
Parameters 

4 

> 5 
8 
, 9 







10 


% 




4 






- 8 . 






• 2 


«4 




. 1 



Full Model 
Ms 

— Not tested-- 
«2 

I 

Mo 



Number of
Parameters



10 

10 

io 

10 



.1^ 



5 • , 
9 
6 
5 



6 
5 
2 

' 1 



r 






Figure 1. Path Model with Two Sets of Congeneric Tests. Adapted from Jöreskog (1974), Figure 3.

Figure 2. Path Model with Four Congeneric Tests. Adapted from Jöreskog (1974), Figure 1.

Figure 3. Path Model for Correlation Corrected for Attenuation.



CHAPTER V

SOME ANALYTIC RESULTS FOR PARAMETERS AFFECTING BIAS IN
GOODNESS OF FIT AND SAMPLING DISTRIBUTION STATISTICS

INTRODUCTION 

The developments in the preceding chapters suggest that the parameters of the observed-score distributions are functions of the parameters of the latent-variable distributions. This is indeed the case. We can write expressions for $\mathbf{b}'$, $R^{2\prime}$, and $s_{e'}^2$ in terms of $\mathbf{b}$ and the population variances and covariances of the latent (true and error) variables. In addition, for a fixed preselected sample size (N), the expected values of $\mathbf{b}$ and $\mathbf{b}'$ and the covariance matrices of $\mathbf{b}$ and $\mathbf{b}'$ can be derived in terms of the structural parameters. We present these results in this chapter and compare $R^{2\prime}$ with $R^2$, $s_{e'}^2$ with $s_e^2$, and $\mathbf{S}_{b'b'}$ with $\mathbf{S}_{bb}$. The comparisons enable us to draw conclusions concerning the parameters affecting bias in the observed-score statistics. We describe the kinds of data and conditions which are likely to lead to incorrect inferences concerning the determinants of true change from observed-score regressions.

These results mean that if an investigator had hypotheses or knowledge about the structural parameters, then he or she could determine the corresponding parameter values for the observed-score population. By comparing these two sets of parameters the researcher could ascertain the degree to which inferences about true change based on analyses of (even very large) samples of observed scores could be expected to be incorrect. However, most investigators are not able to state a priori the population parameter values of the true and error distribution because of a lack of previous research or because the mathematical formalization of "the verbal theory" cannot be accomplished precisely. Even though the exact values of the latent variable parameters are not available in most circumstances, a range of likely or theoretically possible values usually can be prespecified. For these cases, sets of possible latent structure parameter values could be used to generate sets of possible observed-score outcomes. These could be evaluated and the potential for errors of inference due to errors of measurement assessed. In the next chapter we use the results of this chapter to devise an algorithm which takes as input the parameter values of the structural relations among the latent variates as specified by the researcher and outputs the expected values of the observed-score regression parameters for a given sample size.

EXPRESSIONS FOR THE TRUE-SCORE PARAMETERS

Before the expressions for the observed-score parameters can be written in terms of the latent-variable parameters, it is necessary to derive the covariance structure of the latent variables. First, recall the single-equation structural model specifying the true posttest (Y) as a function of the structural regression coefficients ($b_0$, $\mathbf{b}$), the vector of true causal variables ($\mathbf{X}$), and the stochastic error component (e), given in Chapter II as

$$Y = b_0 + \mathbf{b}^t\mathbf{X} + e \tag{1}$$

(where the superscript t represents vector or matrix transposition). The equations of the simplified measurement model are also reproduced for the reader's convenience:

$$\mathbf{x} = \mathbf{X} + \mathbf{u} \tag{2}$$

$$y = Y + v \tag{3}$$

Now the covariance structure of the latent variables can be given. For the true scores we have the covariance matrix of the $\mathbf{X}$,

$$\mathbf{S}_{XX} = \begin{bmatrix} s_{X_1}^2 & & & \\ s_{X_2X_1} & s_{X_2}^2 & & \\ \vdots & \vdots & \ddots & \\ s_{X_kX_1} & s_{X_kX_2} & \cdots & s_{X_k}^2 \end{bmatrix} \quad \text{(symmetric)} \tag{4}$$

and the covariances of the true $\mathbf{X}$ with the true Y,

$$\mathbf{s}_{YX} = \begin{bmatrix} s_{YX_1} \\ s_{YX_2} \\ \vdots \\ s_{YX_k} \end{bmatrix} = \mathbf{S}_{XX}\mathbf{b} \tag{5}$$

The vector of structural regression coefficients can also be written as a linear function of the true variances and covariances, as demonstrated in Chapter II:

$$\mathbf{b} = \mathbf{S}_{XX}^{-1}\mathbf{s}_{YX} \tag{6}$$

and then the intercept coefficient is

$$b_0 = \bar{Y} - \mathbf{b}^t\bar{\mathbf{X}} \tag{7}$$

where the bars designate means or expected values. Since Y is a weighted linear function of $\mathbf{X}$ and e, the variance of Y can be expressed as a weighted linear combination of the variances and covariances of the $\mathbf{X}$ and e, where we make explicit the usual assumption that $E(\mathbf{X}e) = \mathbf{0}$:

$$s_Y^2 = \mathbf{b}^t\mathbf{s}_{YX} + s_e^2 = \mathbf{b}^t\mathbf{S}_{XX}\mathbf{b} + s_e^2 \tag{8}$$
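The relations in Equations 5, 6, and 8 can be checked numerically. The following is a minimal sketch for two predictors, not the report's own program; the input values and the helper names (`inv2`, `matvec`, `dot`) are hypothetical and chosen only for illustration.

```python
# Illustrative sketch of Equations 5, 6 and 8 for two predictors.
# All numeric inputs below are hypothetical.

def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(m, v):
    """Matrix-vector product."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]

def dot(u, v):
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))

S_XX = [[1.0, 0.5], [0.5, 1.0]]   # true-score predictor covariance matrix
b = [0.6, 0.2]                    # structural regression coefficients
s_e2 = 0.3                        # variance of the stochastic component e

s_YX = matvec(S_XX, b)               # Equation 5: s_YX = S_XX b
b_back = matvec(inv2(S_XX), s_YX)    # Equation 6: recovers b
s_Y2 = dot(b, s_YX) + s_e2           # Equation 8: s_Y^2 = b' s_YX + s_e^2
```

Working from `S_XX` and `b` to `s_YX` and back is simply a consistency check: Equations 5 and 6 are inverses of one another whenever the true-score covariance matrix is nonsingular.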

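Equations 4 through 8 may be summarized in the form of the partitioned covariance matrix for Y, $\mathbf{X}$ as follows:

$$\mathbf{S}_{YX} = \begin{bmatrix} s_Y^2 & \mathbf{s}_{YX}^t \\ \mathbf{s}_{YX} & \mathbf{S}_{XX} \end{bmatrix} = \begin{bmatrix} \mathbf{b}^t\mathbf{S}_{XX}\mathbf{b} + s_e^2 & \mathbf{b}^t\mathbf{S}_{XX} \\ \mathbf{S}_{XX}\mathbf{b} & \mathbf{S}_{XX} \end{bmatrix} \tag{9}$$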



$s_e^2$ for True Change

The first index of the magnitude of the systematic relation between Y and $\mathbf{X}$, the square of the standard error of estimate ($s_e^2$), can be expressed in terms of the structural regression coefficients and the true variances and covariances. If we let $\hat{Y}$ represent the systematic or predictable part of Y, then

$$\hat{Y} = b_0 + \mathbf{b}^t\mathbf{X} \tag{10}$$

Note that the structural equation model adopted in this paper (Equation 1) specified that $\hat{Y} \neq Y$ in the population. Thus, e, defined as

$$e = Y - (b_0 + \mathbf{b}^t\mathbf{X}) = Y - \hat{Y} \tag{11}$$

can be taken as a stochastic component representing the fact that the response process or response generating mechanism is probabilistic in nature. Alternatively, e can be conceived as a lack of complete model specification as follows. We take e to be a linear combination of additional predictors of Y,

$$e = X_{k+1} + \cdots + X_p \tag{12}$$

where regression weights are ignored, and impose the restriction that $E(X_iX_j) = 0$ for $i = 1, \ldots, k$ and $j = k+1, \ldots, p$. Then e will function as a random variable in the structural model. The variance of e (the square of the standard error of estimate) is

$$s_e^2 = E(ee) = E\{[Y - (b_0 + \mathbf{b}^t\mathbf{X})]^t\,[Y - (b_0 + \mathbf{b}^t\mathbf{X})]\} \tag{13}$$

Evaluating the right-hand member leads to an expression in which $s_e^2$ is given as a variance of a difference in terms of its components:

$$s_e^2 = s_Y^2 + \mathbf{b}^t\mathbf{S}_{XX}\mathbf{b} - 2\mathbf{b}^t\mathbf{s}_{YX} \tag{14}$$

$R^2$ for True Change

The second index of the degree of systematic relation between Y and $\mathbf{X}$ is the coefficient of multiple determination or squared multiple correlation. It is defined as the ratio of explained variance to total variance:

$$R^2 = \frac{\mathbf{b}^t\mathbf{S}_{XX}\mathbf{b}}{s_Y^2} \tag{15}$$
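The two indices of fit are tied together by Equation 5. As a short worked step (a sketch that uses only the relations already stated in this chapter), substituting $\mathbf{s}_{YX} = \mathbf{S}_{XX}\mathbf{b}$ into Equation 14 and then using Equation 15 gives:

```latex
\begin{aligned}
s_e^2 &= s_Y^2 + \mathbf{b}^t\mathbf{S}_{XX}\mathbf{b} - 2\,\mathbf{b}^t\mathbf{s}_{YX}
       && \text{(Equation 14)} \\
      &= s_Y^2 + \mathbf{b}^t\mathbf{S}_{XX}\mathbf{b} - 2\,\mathbf{b}^t\mathbf{S}_{XX}\mathbf{b}
       && \text{(Equation 5)} \\
      &= s_Y^2 - \mathbf{b}^t\mathbf{S}_{XX}\mathbf{b}, \\[4pt]
R^2   &= \frac{\mathbf{b}^t\mathbf{S}_{XX}\mathbf{b}}{s_Y^2}
       = 1 - \frac{s_e^2}{s_Y^2}
       && \text{(Equation 15)}
\end{aligned}
```

The first result agrees with Equation 8, which states the same decomposition of $s_Y^2$ into explained and residual parts.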



$\mathbf{S}_{bb}$ for True Change

For a fixed sample size the sampling variability of the regression coefficients can be derived for the general case:

$$\mathbf{S}_{bb} = s_e^2\,\frac{1}{N}\,\mathbf{S}_{XX}^{-1} \tag{16}$$

The right member in this equation contains information about the variance structure of Y and $\mathbf{X}$. Having derived a set of equations which involve parameters that apply to true-score regression, we can now focus upon the errors of measurement.

The variance structure of the predictor measurement errors will be denoted as

$$\mathbf{S}_{uu} = \begin{bmatrix} s_{u_1}^2 & & & \\ s_{u_2u_1} & s_{u_2}^2 & & \\ \vdots & \vdots & \ddots & \\ s_{u_ku_1} & s_{u_ku_2} & \cdots & s_{u_k}^2 \end{bmatrix} \quad \text{(symmetric)} \tag{17}$$

The variance of the dependent variable error of measurement is $s_v^2$, and the covariance of v with $\mathbf{u}$ is

$$\mathbf{s}_{vu} = (s_{vu_1},\; s_{vu_2},\; \ldots,\; s_{vu_k})^t \tag{18}$$

The results are amalgamated into a partitioned matrix,

$$\mathbf{S}_{vu} = \begin{bmatrix} s_v^2 & \mathbf{s}_{vu}^t \\ \mathbf{s}_{vu} & \mathbf{S}_{uu} \end{bmatrix} \tag{19}$$



EXPRESSIONS FOR THE OBSERVED-SCORE PARAMETERS

In deriving the covariance structure of the observed variables it is necessary to impose certain restrictions usually associated with classical test theory (Lord & Novick, 1968), viz., that the errors of measurement are uncorrelated with the true scores, that the expected values of the measurement errors are identically zero, and (as a consequence of the preceding) that the expected values of the observed variables equal the expected values of the corresponding true scores. Symbolically we write

$$E(\mathbf{u}Y) = E(\mathbf{X}v) = E(\mathbf{X}\mathbf{u}^t) = \mathbf{0}, \text{ and } E(Yv) = 0 \tag{20}$$

$$E(\mathbf{u}) = \mathbf{0} \text{ and } E(v) = 0, \text{ and} \tag{21}$$

$$E(\mathbf{x}) = E(\mathbf{X}) \text{ and } E(y) = E(Y) \tag{22}$$

The reader should note that the errors of measurement are permitted to be correlated, e.g., $E(\mathbf{u}v) \neq \mathbf{0}$, since in many analyses of change it is quite reasonable to expect pretest and posttest errors to be correlated. Now we present the partitioned covariance matrix of the observed y and $\mathbf{x}$ values:

$$\mathbf{S}_{yx} = \begin{bmatrix} s_y^2 & \mathbf{s}_{yx}^t \\ \mathbf{s}_{yx} & \mathbf{S}_{xx} \end{bmatrix} \tag{23}$$


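Since the true scores and errors of measurement are uncorrelated, $\mathbf{S}_{yx}$ is the sum of $\mathbf{S}_{YX}$ and $\mathbf{S}_{vu}$, and using previous results the covariance structure of the observed scores can be written strictly in terms of the structural parameters:

$$\mathbf{S}_{yx} = \begin{bmatrix} s_Y^2 + s_v^2 & (\mathbf{s}_{YX} + \mathbf{s}_{vu})^t \\ \mathbf{s}_{YX} + \mathbf{s}_{vu} & \mathbf{S}_{XX} + \mathbf{S}_{uu} \end{bmatrix} = \begin{bmatrix} \mathbf{b}^t\mathbf{S}_{XX}\mathbf{b} + s_e^2 + s_v^2 & (\mathbf{S}_{XX}\mathbf{b} + \mathbf{s}_{vu})^t \\ \mathbf{S}_{XX}\mathbf{b} + \mathbf{s}_{vu} & \mathbf{S}_{XX} + \mathbf{S}_{uu} \end{bmatrix} \tag{24}$$

The vector of observed-score regression coefficients, defined as

$$\mathbf{b}' = \mathbf{S}_{xx}^{-1}\mathbf{s}_{xy} \tag{25}$$

can be expressed in terms of the structural parameters:

$$\mathbf{b}' = (\mathbf{S}_{XX} + \mathbf{S}_{uu})^{-1}(\mathbf{s}_{YX} + \mathbf{s}_{vu}) \tag{26}$$

If we let

$$\mathbf{L} = (\mathbf{S}_{XX} + \mathbf{S}_{uu})^{-1}\mathbf{S}_{XX} \quad \text{and} \quad \mathbf{m} = (\mathbf{S}_{XX} + \mathbf{S}_{uu})^{-1}\mathbf{s}_{vu} \tag{27}$$

then

$$\mathbf{b}' = \mathbf{L}\mathbf{b} + \mathbf{m} \tag{28}$$

Thus, the vector of observed-score regression coefficients is seen to be a weighted linear combination of the true-score regression vector and the true and error covariances. The observed intercept then becomes

$$b'_0 = b_0 + (\mathbf{b} - \mathbf{b}')^t\bar{\mathbf{X}} \tag{29}$$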
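To make Equations 26 through 28 concrete, the sketch below computes $\mathbf{L}$, $\mathbf{m}$, and $\mathbf{b}'$ for two predictors. The input matrices are those listed for the worked example in Chapter VI, entered directly as covariances; the helper names are hypothetical, and the code is an illustration rather than the report's program.

```python
# Illustrative sketch of Equations 26-28 for two predictors.
# Inputs are the matrices listed for the Chapter VI example.

def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(m, v):
    """Matrix-vector product."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]

def matmul(p, q):
    """Product of two 2x2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

S_XX = [[1.00, 0.74], [0.74, 1.00]]    # true-score covariances
S_uu = [[0.300, 0.000], [0.000, 0.051]]  # predictor error covariances
s_vu = [0.0, 0.0]                      # criterion-predictor error covariances
b = [1.028, -0.230]                    # true-score regression coefficients

S_xx = [[S_XX[i][j] + S_uu[i][j] for j in range(2)] for i in range(2)]
S_xx_inv = inv2(S_xx)

L = matmul(S_xx_inv, S_XX)             # Equation 27
m = matvec(S_xx_inv, s_vu)             # Equation 27
b_obs = [lb + mi for lb, mi in zip(matvec(L, b), m)]   # Equation 28

# Equation 26 computed directly, as a cross-check on Equation 28
s_YX = matvec(S_XX, b)
b_direct = matvec(S_xx_inv, [a + c for a, c in zip(s_YX, s_vu)])
```

With these inputs the first observed coefficient is smaller than its true-score counterpart and the second changes sign from negative to positive, which is the qualitative pattern discussed in the Chapter VI example. The exact values differ somewhat from those printed there, which are worked from rounded correlations.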

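$s_{e'}^2$ for Observed Change

The goodness of fit indices of the observed-score regression can also be written in terms of the structural parameters. The first index of fit, the variance of the observed residual, can be derived as

$$\begin{aligned} s_{e'}^2 &= E\{[y - (b'_0 + \mathbf{b}'^t\mathbf{x})]^t\,[y - (b'_0 + \mathbf{b}'^t\mathbf{x})]\} \\ &= E\{[(Y + v) - (b'_0 + \mathbf{b}'^t(\mathbf{X} + \mathbf{u}))]^t\,[(Y + v) - (b'_0 + \mathbf{b}'^t(\mathbf{X} + \mathbf{u}))]\} \\ &= s_Y^2 + s_v^2 + \mathbf{b}'^t\mathbf{S}_{XX}\mathbf{b}' + \mathbf{b}'^t\mathbf{S}_{uu}\mathbf{b}' - 2\mathbf{b}'^t(\mathbf{s}_{YX} + \mathbf{s}_{vu}) \end{aligned} \tag{30}$$

Equation 28 can be employed to express $s_{e'}^2$ as a function of the parameters of the latent variable distributions:

$$s_{e'}^2 = s_Y^2 + s_v^2 + (\mathbf{L}\mathbf{b} + \mathbf{m})^t\mathbf{S}_{XX}(\mathbf{L}\mathbf{b} + \mathbf{m}) + (\mathbf{L}\mathbf{b} + \mathbf{m})^t\mathbf{S}_{uu}(\mathbf{L}\mathbf{b} + \mathbf{m}) - 2[(\mathbf{L}\mathbf{b} + \mathbf{m})^t(\mathbf{s}_{YX} + \mathbf{s}_{vu})] \tag{31}$$

$R^{2\prime}$ for Observed Change

Second, the coefficient of multiple determination is given by

$$R^{2\prime} = \frac{\mathbf{b}'^t\mathbf{S}_{xx}\mathbf{b}'}{s_y^2} = \frac{\mathbf{b}'^t(\mathbf{S}_{XX} + \mathbf{S}_{uu})\mathbf{b}'}{s_Y^2 + s_v^2} \tag{32}$$

Using Equation 28 the coefficient of multiple determination of the observed variables may be written exclusively in terms of the latent variable parameters as

$$\begin{aligned} R^{2\prime} = (&\mathbf{b}^t\mathbf{L}^t\mathbf{S}_{XX}\mathbf{L}\mathbf{b} + \mathbf{b}^t\mathbf{L}^t\mathbf{S}_{uu}\mathbf{L}\mathbf{b} + \mathbf{m}^t\mathbf{S}_{XX}\mathbf{L}\mathbf{b} + \mathbf{m}^t\mathbf{S}_{uu}\mathbf{L}\mathbf{b} \\ &+ \mathbf{b}^t\mathbf{L}^t\mathbf{S}_{XX}\mathbf{m} + \mathbf{b}^t\mathbf{L}^t\mathbf{S}_{uu}\mathbf{m} + \mathbf{m}^t\mathbf{S}_{XX}\mathbf{m} + \mathbf{m}^t\mathbf{S}_{uu}\mathbf{m}) \,/\, (s_Y^2 + s_v^2) \end{aligned} \tag{33}$$

$\mathbf{S}_{b'b'}$ for Observed Change

Information about the joint sampling distribution of the observed-score regression coefficients is contained in

$$\mathbf{S}_{b'b'} = s_{e'}^2\,\frac{1}{N}\,(\mathbf{S}_{XX} + \mathbf{S}_{uu})^{-1} \tag{34}$$

Clearly Equation 31 can be used to write $\mathbf{S}_{b'b'}$ as a function of the latent variable parameters:

$$\mathbf{S}_{b'b'} = \{s_Y^2 + s_v^2 + (\mathbf{L}\mathbf{b} + \mathbf{m})^t\mathbf{S}_{XX}(\mathbf{L}\mathbf{b} + \mathbf{m}) + (\mathbf{L}\mathbf{b} + \mathbf{m})^t\mathbf{S}_{uu}(\mathbf{L}\mathbf{b} + \mathbf{m}) - 2[(\mathbf{L}\mathbf{b} + \mathbf{m})^t(\mathbf{s}_{YX} + \mathbf{s}_{vu})]\}\,\frac{1}{N}\,(\mathbf{S}_{XX} + \mathbf{S}_{uu})^{-1} \tag{35}$$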

PARAMETERS AFFECTING BIAS IN OBSERVED-SCORE REGRESSION STATISTICS

Thus far, expressions for the true-score regression parameters ($\mathbf{b}$, $s_e^2$, $R^2$, and $\mathbf{S}_{bb}$) and the observed-score regression parameters ($\mathbf{b}'$, $s_{e'}^2$, $R^{2\prime}$, and $\mathbf{S}_{b'b'}$) have been derived exclusively in terms of the parameters of the joint distributions of the true scores, $\mathbf{X}$ and Y. In the following sections, we compare the parametric expressions for pairs of corresponding true-score and observed-score regression statistics. This process enables us to state some new analytic results demonstrating how the bias in the observed-score regression estimators is affected by the distributions of the true and error components of the observed scores. In many cases, however, simple general statements cannot be made without making strong assumptions because of the mathematical complexities. Even the general expressions provide insights into the biasing effects of errors of measurement and enable investigators to estimate a priori the degree of bias that is likely to be found in most studies of change.



Parameters Affecting Bias in $s_{e'}^2$

To proceed, Equation 14 for $s_e^2$ and Equation 31 for $s_{e'}^2$ are segregated into three corresponding units based on their comparable structure. These are labeled A, B and C for $s_e^2$ and A', B', and C' for $s_{e'}^2$.

$s_e^2$                                          $s_{e'}^2$

A: $s_Y^2$                                       A': $s_Y^2 + s_v^2$

B: $\mathbf{b}^t\mathbf{S}_{XX}\mathbf{b}$       B': $(\mathbf{L}\mathbf{b} + \mathbf{m})^t\mathbf{S}_{XX}(\mathbf{L}\mathbf{b} + \mathbf{m}) + (\mathbf{L}\mathbf{b} + \mathbf{m})^t\mathbf{S}_{uu}(\mathbf{L}\mathbf{b} + \mathbf{m})$

C: $-2\mathbf{b}^t\mathbf{s}_{YX}$               C': $-2[(\mathbf{L}\mathbf{b} + \mathbf{m})^t(\mathbf{s}_{YX} + \mathbf{s}_{vu})]$

where from Equation 27

$$\mathbf{L} = (\mathbf{S}_{XX} + \mathbf{S}_{uu})^{-1}\mathbf{S}_{XX}, \qquad \mathbf{m} = (\mathbf{S}_{XX} + \mathbf{S}_{uu})^{-1}\mathbf{s}_{vu}$$

It can be seen immediately by comparing A and A' that the observed-score residual will exceed the true-score residual when $s_v^2$ is greater than zero. The discrepancy will increase as the magnitude of $s_v^2$ increases. Since power, or one minus the probability of Type II error, is an inverse function of $s_{e'}^2$, it is clear that measurement error in the criterion reduces the power of observed-score vis-à-vis true-score regression tests.
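The comparison of A and A' is easiest to see in the special case where only the criterion is measured with error. As a sketch under that assumption ($\mathbf{S}_{uu} = \mathbf{0}$ and $\mathbf{s}_{vu} = \mathbf{0}$, a restriction introduced here purely for illustration), Equations 27 and 31 reduce as follows:

```latex
\begin{aligned}
\mathbf{L} &= \mathbf{S}_{XX}^{-1}\mathbf{S}_{XX} = \mathbf{I}, \qquad
\mathbf{m} = \mathbf{S}_{XX}^{-1}\mathbf{0} = \mathbf{0}, \qquad
\mathbf{b}' = \mathbf{b}, \\[4pt]
s_{e'}^2 &= s_Y^2 + s_v^2 + \mathbf{b}^t\mathbf{S}_{XX}\mathbf{b}
            - 2\,\mathbf{b}^t\mathbf{s}_{YX} \\
         &= s_e^2 + s_v^2 .
\end{aligned}
```

In this case the regression weights are unbiased, and the observed residual variance exceeds the true residual variance by exactly the criterion error variance.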

As $\mathbf{s}_{vu}$ increases, C' decreases and B' increases relative to C and B (holding all other terms constant). The effect of $\mathbf{s}_{vu}$ on the observed residual depends upon the magnitudes of the remaining true-score and error covariance terms. In general, positive covariances among the criterion and predictor measurement errors will reduce the bias in $s_{e'}^2$ as an estimator of $s_e^2$. Negative covariances, however, will tend to increase the bias. It seems impossible to make a general statement about the absolute difference between $s_{e'}^2$ and $s_e^2$ as a function of $\mathbf{s}_{vu}$. The actual degree of bias will vary with the size and pattern of intercorrelation among the $\mathbf{X}$ and the $\mathbf{u}$. It can be concluded, however, that in general bias will increase as $\mathbf{s}_{vu}$ decreases. This is probably a fortunate result for investigations of change, because error covariances among pre- and posttest measurements will be positive in most circumstances.

The effect of $\mathbf{S}_{uu}$ on $s_{e'}^2$ is difficult to assess since it appears in $\mathbf{L}$, $\mathbf{m}$, and B'. When $\mathbf{S}_{uu}$ is diagonal and $\mathbf{s}_{vu} = \mathbf{0}$, the larger the measurement error variances, the larger $s_{e'}^2$ will be relative to $s_e^2$. The effect of $\mathbf{S}_{uu}$ on the bias in $s_{e'}^2$ cannot be ascertained for the general situation in which the errors of measurement may be positively or negatively interrelated.

Finally, evaluation of B' and C' reveals that the bias in $s_{e'}^2$ will be reduced if $\mathbf{S}_{XX}$ dominates $\mathbf{S}_{uu}$ in $\mathbf{L}$. As $\mathbf{S}_{xx}$ approaches $\mathbf{S}_{XX}$ in value, $s_{e'}^2$ approaches $s_e^2$. However, the pattern of relations among the $\mathbf{X}$ and among the $\mathbf{u}$ can nullify this.

In summary, the bias in $s_{e'}^2$ will increase as a function of the variance of the measurement errors in the dependent and independent variables and the covariances of the measurement errors in the predictor variables. Negatively correlated criterion and predictor measurement errors tend to increase the bias. In most analyses of educational change, measurement error will reduce the power of statistical tests, decrease the precision of parameter estimation, and increase the probability of inferential errors of the second kind.

Parameters Affecting Bias in $R^{2\prime}$

Reference to Equations 15 and 33 indicates that the following segmentation can be made:

$R^2$                                            $R^{2\prime}$

A: $1/s_Y^2$                                     A': $1/(s_Y^2 + s_v^2)$

B: $\mathbf{b}^t\mathbf{S}_{XX}\mathbf{b}$       B': $\mathbf{b}^t\mathbf{L}^t\mathbf{S}_{XX}\mathbf{L}\mathbf{b} + \mathbf{b}^t\mathbf{L}^t\mathbf{S}_{uu}\mathbf{L}\mathbf{b} + \mathbf{m}^t\mathbf{S}_{XX}\mathbf{L}\mathbf{b} + \mathbf{m}^t\mathbf{S}_{uu}\mathbf{L}\mathbf{b} + \mathbf{b}^t\mathbf{L}^t\mathbf{S}_{XX}\mathbf{m} + \mathbf{b}^t\mathbf{L}^t\mathbf{S}_{uu}\mathbf{m} + \mathbf{m}^t\mathbf{S}_{XX}\mathbf{m} + \mathbf{m}^t\mathbf{S}_{uu}\mathbf{m}$

Inspection of A' indicates that errors of measurement in the criterion variable negatively bias the estimation of the squared multiple correlation. As the unreliability of y increases, the bias (underestimation) of $R^2$ grows. Thus, both goodness of fit parameters ($s_{e'}^2$ and $R^{2\prime}$) are negatively biased by errors of measurement in the criterion.

The effect of $\mathbf{s}_{vu}$ on bias in the squared multiple correlation is similar to its effect on the regression residual as demonstrated in the preceding section. Negative covariances among the criterion and predictor measurement errors will increase the bias in $R^{2\prime}$. Positive covariances will tend to decrease the bias. The actual amount of bias ($= R^{2\prime} - R^2$) is a complex function of $\mathbf{S}_{XX}$ and $\mathbf{S}_{uu}$ as well as $\mathbf{s}_{vu}$. General statements do not appear possible.

Measurement errors in the predictors affect the bias in $R^{2\prime}$ in a complex way. The role of $\mathbf{S}_{uu}$ in $\mathbf{L}$ increases bias as long as the covariances are positive. On the other hand, the separate terms involving $\mathbf{S}_{uu}$ in B' tend to decrease bias when the error covariances are greater than zero. The total effect of $\mathbf{S}_{uu}$ on bias will depend, therefore, on the actual values of the true and error covariances. Working through a series of examples indicates that in most analyses of educational change, the overall effect of predictor measurement errors will be to increase the bias in the estimate of the squared multiple correlation. This assumes that most error covariances are positive and small in size. The degree of bias decreases as $\mathbf{S}_{XX}$ dominates $\mathbf{S}_{uu}$. In conclusion, observed-score regression analyses of change on the average will underestimate the goodness of fit of the model in most applications. "On the average" does not mean "always," so investigators should be cautious in assuming that the squared multiple correlation estimate has been attenuated.
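For a single predictor with uncorrelated errors, the general expressions collapse to the familiar attenuation result. The following is a sketch under those assumptions ($k = 1$ and $s_{vu} = 0$); the reliability symbols $\rho_{xx} = s_X^2/(s_X^2 + s_u^2)$ and $\rho_{yy} = s_Y^2/(s_Y^2 + s_v^2)$ are introduced here for illustration and are not notation used elsewhere in this chapter.

```latex
\begin{aligned}
L &= \frac{s_X^2}{s_X^2 + s_u^2} = \rho_{xx}, \qquad m = 0, \qquad
b' = \rho_{xx}\, b
   && \text{(Equations 27, 28)} \\[4pt]
R^{2\prime} &= \frac{b'^2\,(s_X^2 + s_u^2)}{s_Y^2 + s_v^2}
             = \frac{\rho_{xx}^2\, b^2\, (s_X^2 + s_u^2)}{s_Y^2 + s_v^2}
             = \frac{\rho_{xx}\, b^2 s_X^2}{s_Y^2 + s_v^2}
   && \text{(Equation 32)} \\[4pt]
            &= \rho_{xx}\,\rho_{yy}\,\frac{b^2 s_X^2}{s_Y^2}
             = \rho_{xx}\,\rho_{yy}\, R^2 .
\end{aligned}
```

Because both reliabilities are at most one, $R^{2\prime} \le R^2$ in this restricted case. With several predictors or correlated errors this simple ordering is no longer guaranteed, which is the caution expressed in the preceding paragraph.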



Parameters Affecting Bias in $\mathbf{S}_{b'b'}$

Reference to Equations 16 and 34 reveals that the following structural comparisons can be made for $\mathbf{S}_{bb}$ and $\mathbf{S}_{b'b'}$:

$\mathbf{S}_{bb}$                                $\mathbf{S}_{b'b'}$

A: $s_e^2$                                       A': $s_{e'}^2$

B: $(1/N)\,\mathbf{S}_{XX}^{-1}$                 B': $(1/N)(\mathbf{S}_{XX} + \mathbf{S}_{uu})^{-1}$

A and A' indicate that the factors which affect $s_{e'}^2$ will influence $\mathbf{S}_{b'b'}$ in the same ways. Thus, the estimates of the standard errors will be inflated by posttest measurement errors and negative criterion-predictor measurement error covariances. The effect of $\mathbf{S}_{uu}$ is difficult to assess for the general case. The elements of $\mathbf{S}_{b'b'}$ tend to increase as the variances and covariances in $\mathbf{S}_{uu}$ increase. It is the pattern of elements in $\mathbf{S}_{uu}$, however, which determines the extent of bias in $\mathbf{S}_{b'b'}$ generally.

The effect of the patterns of interrelations among the true and error components on bias in $\mathbf{S}_{b'b'}$ is most apparent in segments B and B'. For the situation in which $\mathbf{S}_{uu}$ is diagonal and small relative to $\mathbf{S}_{XX}$, predictor errors of measurement make the sampling distribution estimates too large. When $\mathbf{S}_{uu}$ is nondiagonal, containing both positive and negative covariances which approximate the elements of $\mathbf{S}_{XX}$ in value, a general result concerning the bias in $\mathbf{S}_{b'b'}$ cannot be derived. In conclusion, measurement errors tend to make estimates of the regression coefficients less precise than they would be if perfectly reliable variables were used. The degree of bias is a joint function of $s_v^2$, $\mathbf{s}_{vu}$, $\mathbf{S}_{uu}$ and $\mathbf{S}_{XX}$. Statements that apply across all conditions and patterns of relationship cannot be made, however.



CONCLUSION

In this chapter general matrix expressions for the true-score regression parameters have been given. The observed-score regression parameters were expressed as functions of the true-score regression parameters and the true and error covariance structures. The parameters affecting bias in the observed-score regression statistics were evaluated by comparing the expressions for the observed- and true-score coefficients. Specifically, the biasing effects of $s_v^2$, $\mathbf{s}_{vu}$, $\mathbf{S}_{uu}$ and $\mathbf{S}_{XX}$ on $s_{e'}^2$, $R^{2\prime}$, and $\mathbf{S}_{b'b'}$ were explored. Some unequivocal statements could be made, e.g., bias increases in all observed-score estimators as a direct function of $s_v^2$. By making strong assumptions about the error structure, viz. that $\mathbf{s}_{vu}$ equals $\mathbf{0}$ and $\mathbf{S}_{uu}$ is diagonal, other general statements could be made, e.g., bias increases as $\mathbf{S}_{uu}$ increases. However, it was not possible to draw unqualified general conclusions about the parametric determinants of bias. Much insight into the biasing effects of measurement error has been gained by examination of the expressions derived in this chapter. In the next chapter these formulas will permit development of an algorithm that can be used in studies of change to assess the potential bias caused by the unreliability of measures.



CHAPTER VI

AN ALGORITHM FOR ASSESSING BIAS IN PLANNED
STUDIES OF CHANGE

INTRODUCTION

The purpose of this chapter is to develop a method for investigators to easily assess the possible impact of measurement error on statistical analyses of change. Using the results of the preceding chapters, especially those of Chapter V, an algorithm is developed which takes as input estimates of the parameter values of the structural relations among the latent variables (which the investigator thinks are close to the true values a priori) and outputs the expected values of the corresponding observed-score regression parameters for a prespecified sample size. The logic of the algorithm is explained and illustrated with a simple example of the effects of external locus of control orientation on change in science achievement.

As part of this research program, the algorithm was implemented in the form of a FORTRAN computer program, which can be easily installed in most software libraries. The program enables researchers to input a series of estimates of the true-score parameter values and obtain expected values of the corresponding observed-score regressions. In the final section of the chapter, a comprehensive application of the computer program is presented.

Use of the program will enable investigators to become aware of the ways in which measurement error may bias regression analyses of change. Making this evaluation before data collection is completely analogous to carrying out a power analysis. The results of the assessment may lead the investigator to modify data collection plans. For example, the program may reveal that the reliability of the pretest must be increased if accurate inferences are to be possible. The assessment may indicate that bias cannot be avoided easily and prompt the investigator to gather the data in such a way as to make the use of attenuation-correction methods or multiple indicator (LISREL) models possible. Also, as with power analysis, the program can be used post hoc to determine the degree of caution one should have when interpreting the results of the regression analyses of observed scores. In many situations, like the one described in the example in this chapter, it will be concluded that the possible bias in the observed-score regression estimators was so great that any inferences must be regarded as completely suspect.

THE ALGORITHM

The algorithm requires information about the structural parameters and the variance structure of the true and error components as input. Specifically, hypothesized or likely values of $\mathbf{S}_{XX}$, $\mathbf{S}_{uu}$, $s_Y^2$, $\mathbf{s}_{vu}$, $s_v^2$, and $\mathbf{b}$ are necessary. When, as in this chapter, two simplifying assumptions are made, viz., that $\mathbf{S}_{uu}$ is diagonal and $\mathbf{s}_{vu} = \mathbf{0}$, information about the reliabilities of the observed predictor variables can be used instead of error variances. The algorithm, however, is developed in a general form that will accommodate any measurement error variance structure. In the following we let q equal the number of predictor variables in the model plus one.

The algorithm first computes the important true-score regression statistics, $R^2$, $s_e^2$, and $\mathbf{S}_{bb}$, as follows:

$$R^2 = (\mathbf{b}^t\mathbf{S}_{XX}\mathbf{b}) \,/\, s_Y^2 \tag{1}$$

$$s_e^2 = [N/(N-q)]\; s_Y^2\,(1 - R^2) \tag{2}$$

$$\mathbf{S}_{bb} = s_e^2\,[1/(N-q)]\;\mathbf{S}_{XX}^{-1} \tag{3}$$

$$\mathbf{s}_{YX} = \mathbf{S}_{XX}\mathbf{b} \tag{4}$$

If one had information about $\mathbf{s}_{YX}$ instead of $\mathbf{b}$, Equation 4 could be solved first and then Equations 1-3. These equations have been derived in more complicated forms in the preceding chapters. For ease of application, they are presented in their simplest or most easily calculable form here. The t-tests associated with the hypothesis that the regression coefficient equals zero in the population are determined next:

$$t_{b_1} = b_1 / s_{b_1}, \quad t_{b_2} = b_2 / s_{b_2}, \quad \ldots \tag{5}$$

where the probability associated with each t is a function of N-q degrees of freedom.

In the next stage of the algorithm, the variance structure of the observed x and y scores is derived:

$$\mathbf{S}_{xx} = \mathbf{S}_{XX} + \mathbf{S}_{uu} \tag{6}$$

$$\mathbf{s}_{yx} = \mathbf{s}_{YX} + \mathbf{s}_{vu} \tag{7}$$

$$s_y^2 = s_Y^2 + s_v^2 \tag{8}$$

Standard regression formulas are then applied to the observed-score covariance matrix to derive estimates of the observed regression parameters. Observed-score parameters, corresponding to the true-score parameters given by Equations 1-3 and 5, are found as follows:

$$R^{2\prime} = (\mathbf{b}'^t\mathbf{S}_{xx}\mathbf{b}') \,/\, s_y^2 \tag{9}$$

$$s_{e'}^2 = [N/(N-q)]\; s_y^2\,(1 - R^{2\prime}) \tag{10}$$

$$\mathbf{S}_{b'b'} = s_{e'}^2\,[1/(N-q)]\;\mathbf{S}_{xx}^{-1} \tag{11}$$

$$t_{b'_1} = b'_1 / s_{b'_1}, \quad t_{b'_2} = b'_2 / s_{b'_2}, \quad \ldots \tag{12}$$

The estimates and significance test results for the true-score and observed-score distributions can be compared with ease and the potential for bias and incorrect inferences assessed.
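The two stages of the algorithm can be sketched compactly in code. The following is an illustrative Python sketch restricted to two predictors; it is not the FORTRAN program described in this report. The function names, the hypothetical input values, and the use of $[1/(N-q)]$ in the covariance matrix of the coefficients (Equations 3 and 11 as read here) are all assumptions of this sketch.

```python
import math

def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def regression_stats(S, s_cov, s2, N, q):
    """Apply Equations 1-3 and 5 (or 9-12) to one covariance structure.

    S: 2x2 predictor covariance matrix; s_cov: predictor-criterion
    covariances; s2: criterion variance.
    """
    S_inv = inv2(S)
    coef = matvec(S_inv, s_cov)
    r2 = dot(coef, s_cov) / s2           # equals coef' S coef / s2
    se2 = (N / (N - q)) * s2 * (1.0 - r2)
    cov_b = [[se2 / (N - q) * v for v in row] for row in S_inv]
    t = [coef[i] / math.sqrt(cov_b[i][i]) for i in range(len(coef))]
    return {"coef": coef, "R2": r2, "se2": se2, "cov_b": cov_b, "t": t}

def assess_bias(S_XX, S_uu, s_vu, s_v2, s_Y2, b, N):
    """Return (true-score stats, observed-score stats)."""
    q = len(b) + 1
    s_YX = matvec(S_XX, b)                                   # Equation 4
    true_part = regression_stats(S_XX, s_YX, s_Y2, N, q)
    S_xx = [[x + u for x, u in zip(rx, ru)]
            for rx, ru in zip(S_XX, S_uu)]                   # Equation 6
    s_yx = [x + u for x, u in zip(s_YX, s_vu)]               # Equation 7
    s_y2 = s_Y2 + s_v2                                       # Equation 8
    observed_part = regression_stats(S_xx, s_yx, s_y2, N, q)
    return true_part, observed_part

# Hypothetical inputs, chosen only to exercise the function
S_XX = [[1.0, 0.5], [0.5, 1.0]]
S_uu = [[0.25, 0.0], [0.0, 0.25]]
b = [0.6, 0.2]
s_Y2, s_v2, N = 1.0, 0.2, 200
true_stats, obs_stats = assess_bias(S_XX, S_uu, [0.0, 0.0], s_v2, s_Y2, b, N)
```

Comparing `true_stats` with `obs_stats` field by field gives the kind of side-by-side assessment the algorithm is meant to support. Setting all error terms to zero should return identical results for the two stages, which is a convenient sanity check on any implementation.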



AN EXAMPLE

To illustrate the value and use of the algorithm a brief example is presented. Consider an investigation which seeks to test the hypothesis that external locus of control orientation exerts a negative effect on true change in science achievement. The variables are posttest science achievement (Y), pretest science achievement ($X_1$), and external locus of control ($X_2$). It is assumed that the correlation between true pretest achievement and true external locus of control is .74, and that the effect of locus of control on true change is -.230. The reliabilities of the science pretest and external locus of control scales are assumed to be .769 and .951, respectively. Complete input information includes:

$$\mathbf{S}_{XX} = \begin{bmatrix} 1.00 & .74 \\ .74 & 1.00 \end{bmatrix}, \quad \mathbf{S}_{uu} = \begin{bmatrix} .300 & .000 \\ .000 & .051 \end{bmatrix}, \quad \mathbf{s}_{vu} = \begin{bmatrix} .000 \\ .000 \end{bmatrix}, \quad \mathbf{b} = \begin{bmatrix} 1.028 \\ -.230 \end{bmatrix},$$

$$s_Y^2 = 1.0, \quad N = 200, \quad \text{and} \quad q = 3.$$

The true-score regression parameters are found using Equations 1 through 5:

$$R^2 = .860$$

$$s_e^2 = .141$$

$$\mathbf{S}_{bb} = \begin{bmatrix} .002 & -.001 \\ -.001 & .002 \end{bmatrix}$$

$$\mathbf{s}_{YX} = \begin{bmatrix} .86 \\ .53 \end{bmatrix}$$

$$t_{b_1} = 23.00, \quad t_{b_2} = -5.15.$$

With a sample of 200 observations the regression coefficients would be
extremely well estimated. It is clear that a substantial portion of the
true posttest achievement variance can be explained on the basis of the
true pretest and locus of control (R² = .86). The inference that the
effect of external locus of control is negative would be strongly
supported [t(197) = -5.15, p < .001].



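Having derived the true-score regression coefficients for the
hypothesized true-score distribution, we use Equations 9-12 to obtain
the corresponding observed-score coefficients. First, the observed
correlation matrix is calculated,

$$\Sigma_{xx} = \begin{bmatrix} 1.00 & .60 \\ .60 & 1.00 \end{bmatrix},$$

and then the observed predictor-criterion covariance vector:

$$\sigma_{yx} = \begin{bmatrix} .70 \\ .50 \end{bmatrix}.$$

The regression estimates are then easily computed:

$$b' = \begin{bmatrix} .63 \\ .13 \end{bmatrix}, \quad
R^{2\prime} = .50, \quad s_{e'}^2 = .50, \quad
\Sigma_{b'b'} = \begin{bmatrix} .004 & -.002 \\ -.002 & .004 \end{bmatrix},$$

$$t_{b'_1} = 9.96, \quad t_{b'_2} = 2.06.$$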

The observed-score regression weight for the pretest is attenuated,
as would be expected from the unreliability of the science pretest. Much
to our hypothetical investigator's chagrin, the observed-score regression
estimate of the effect of external locus of control is (significantly)
positive! Thus, inferences about the effects of external locus of
control on change in science achievement would be completely erroneous if
the fallibility of the measures were not recognized. Use of the
algorithm, however, alerted our hypothetical researcher to the potential
danger, thus enabling him or her to take corrective actions prior to data
collection or to be appropriately cautious in interpreting the results if
the study had already been completed. It is also worth noting that the
algorithm showed that on the average the observed-score model would
evidence less goodness of fit and lower power.

FORTRAN COMPUTER PROGRAM 
The algorithm described above was implemented as a FORTRAN program as
one part of the overall research effort. It is written in standard
FORTRAN and, although it was run on the WATFIV compiler at New York
University, the program can be installed without much effort on any
computer system. Furthermore, the structure allows it to be modified
easily to handle more general problems, e.g., more than two predictors.
The program is designed to be optimally useful to investigators who are
planning a study of change (but have not yet begun data collection).

As the flow charts in Figures 1 and 2 show, the program follows the
structure of the algorithm presented above very closely. A source
listing of the program appears in the appendix to this chapter. Input to
the program consists of information about the variance structure of the
true scores, the true-score regression coefficients, and the
reliabilities of the observed predictors. The program allows calculation
of several sets of parameter values; thus the input specifies a range of
values and the number of estimates to be calculated within that range.

Figure 1
MAIN PROGRAM

   Initialize Σ
   Read low, high values and number of steps for s12, r11, r22, b1 and b2
   Determine increments
   Set s12
   Set r11 : s_u1u1
   Set r22 : s_u2u2
   Set b1
   Set b2
   Call REGRES

Figure 2
REGRES SUBROUTINE

   Compute R², s_e², Σ_bb, σ_YX
   Compute Σ_xx
   Compute b', R²', s_e'², Σ_b'b', σ_yx

   Other Subroutines:

   MVMAT: Computes matrix-vector product.
   MINV: Computes matrix inverse.
   QUADF: Computes quadratic form.

In the version of the program illustrated in this chapter, two
predictors, X1 and X2, are permitted. Let n1 refer to the number
of possible values of s12 that are specified by the input, n2 to the
number of levels of r11 (s²_u1), n3 to the number of levels of r22
(s²_u2), n4 to the number of values of b1, and n5 to the number of
values of b2. Then one run of the program produces n1 x n2 x n3 x
n4 x n5 combinations of solutions.
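
The way a run expands its input ranges into combinations can be sketched as follows. This is a Python illustration rather than the program itself: the helper `levels` is a hypothetical name, the ranges and step counts are assumed for the example, and the guard for a single step is an addition not present in the original listing.

```python
from itertools import product


def levels(low, high, steps):
    """Equally spaced values from low to high, inclusive."""
    if steps == 1:
        return [low]  # guard added here to avoid dividing by zero
    increment = (high - low) / (steps - 1)
    return [low + i * increment for i in range(steps)]


# Assumed ranges: n1 = 2, n2 = 2, n3 = 2, n4 = 3, n5 = 3.
s12_levels = levels(-0.4, 0.4, 2)
r11_levels = levels(0.6, 0.9, 2)
r22_levels = levels(0.7, 0.9, 2)
b1_levels = levels(0.1, 0.7, 3)
b2_levels = levels(-0.4, 0.4, 3)

combinations = list(product(s12_levels, r11_levels, r22_levels,
                            b1_levels, b2_levels))
print(len(combinations))
```

The number of combinations is the product of the five step counts, so even modest ranges generate a sizable table.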

This iterative feature was included since researchers will often have
little confidence in their specific a priori expectations about
true-score regression parameters. Frequently, however, a range of
possible parameter values can be stated with some confidence. The
program assesses the potential biases for all combinations of suspected
parameter values in a single run. Although the program is formulated in
terms of covariances, it will be used most often with standardized
estimates. Hence, all illustrations below are given in terms of
correlations and standard regression weights.

Once the main program has converted the reliabilities into error
variances and calculated the increments in parameter values which cover
the prespecified range from lowest to highest values, a subroutine which
performs the major computations is called for each combination of
parameter values. On each call, the true-score regression parameters
R², s_e², Σ_bb, t_b1, t_b2, and σ_YX are calculated first. Then the
covariance matrix of the observed scores, Σ_xx, is determined. Finally,
the values of the observed-score regression parameters, b', R²', s_e'²,
Σ_b'b', t_b'1, t_b'2, and σ_yx, are obtained and printed. The program
terminates after the final call to the subroutine.
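
The sequence of computations performed on each call can be sketched in Python as below. This is a hedged sketch rather than a translation of the subroutine. It assumes unit true-score variances, uncorrelated measurement errors, and an error-free criterion, and it divides the sampling variance by N as the appendix listing does (Equation 11 as printed uses N-q). The names `fit` and `regres` and the sample inputs are illustrative.

```python
import math


def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def fit(cov_xx, cov_yx, var_y, n, q):
    """Regression weights, R squared, MSE, standard errors and t ratios."""
    inv = inverse_2x2(cov_xx)
    b = [inv[i][0] * cov_yx[0] + inv[i][1] * cov_yx[1] for i in range(2)]
    # b' S b equals b . cov_yx because S b = cov_yx
    r2 = (b[0] * cov_yx[0] + b[1] * cov_yx[1]) / var_y
    mse = var_y * n / (n - q) * (1.0 - r2)
    se = [math.sqrt(mse * inv[i][i] / n) for i in range(2)]
    return {"b": b, "r2": r2, "mse": mse, "se": se,
            "t": [b[i] / se[i] for i in range(2)]}


def regres(s12, r11, r22, tb, n=100, q=3, var_y=1.0):
    """True-score results first, then the observed-score counterparts."""
    true_xx = [[1.0, s12], [s12, 1.0]]
    true_yx = [true_xx[i][0] * tb[0] + true_xx[i][1] * tb[1]
               for i in range(2)]
    true = fit(true_xx, true_yx, var_y, n, q)
    # Observed variance = true variance + error variance = 1 / reliability
    obs_xx = [[1.0 / r11, s12], [s12, 1.0 / r22]]
    observed = fit(obs_xx, true_yx, var_y, n, q)
    return true, observed


true, observed = regres(-0.4, 0.65, 0.7, (0.1, -0.4))
print(true["b"], observed["b"])
print(true["t"], observed["t"])
```

Setting both reliabilities to 1.0 makes the observed-score results coincide with the true-score results, which is a convenient check on any implementation.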






AN APPLICATION

To illustrate the use of the program, it was applied to the following
problem. An investigator was planning a study of change in which it was
anticipated that b2 could range between -.4 and .4, b1 between .1 and
.7, r22 between .7 and .9, r11 between .6 and .9, and the correlation
between true X1 and true X2 between -.4 and .4. Within this set of
conditions, what degree of bias could be anticipated in the
observed-score regression coefficients as estimators of the true-score
regression parameters? The program evaluated 2 x 2 x 3 x 2 x 3 = 72
combinations of conditions and printed the results in Table 1.






Insert Table 1 about here 






Comparison of the true-score squared multiple correlation (column
R2T) with the observed-score squared multiple correlation (column R2O)
indicates that little bias should be expected. As long as the
reliabilities are high, R²' almost equals R². When the reliabilities
drop, R²' underestimates R² by up to .2. Consistent with this result
is the comparison of s_e² (VET) and s_e'² (VEO), which indicates
that s_e'² is inflated only for combinations of low reliabilities
(r11 = .6, r22 = .7). Bias in neither goodness of fit index seems
large enough to cause the investigator much concern. Observed-score
results concerning the adequacy of the model should be reasonably close
to the true-score parameters on the average.

The major statistics of interest in the study of change are b2 and
t_b2. The columns headed TB2 and OB2 (b2 and b'2, respectively) show
that there is some bias. The degree of bias is as great as ±.2
[-.2 - (-.4) and .2 - .4] in absolute terms and 50% on a relative basis
(-.2/-.4 and .2/.4). For all combinations except three, the
observed-score coefficient falls within the -.4 to .4 range. In the
null case, that is, where b2 = 0, the observed-score bias is quite
small across all parameter combinations. Examination of the t-ratios
for b2 and b'2 (columns T-TB2 and T-OB2, respectively) reveals that the
true-score and observed-score results are perfectly consistent. That
is, there are no combinations of nonnull conditions for which t_b2 is
significant but t_b'2 is not, and vice versa. When the true-score
regression weight equals 0, there are no instances where T-OB2 leads to
rejection of the hypothesis that the observed-score regression
coefficient equals 0.

Although some power will be lost as a result of unreliability, our
hypothetical researcher learns from the output of the program that
measurement error will not lead to incorrect inferences about the effect
of X2 on true change (i.e., about b2) on the average. Of course, this
was the major concern that motivated the preliminary analysis of
potential bias due to unreliability. In this situation the investigator
may well decide to proceed with collecting data on X1, X2, and y and
subsequently performing a regression analysis of the observed scores.
The advisability of this course of action depends upon how closely the
hypothesized parameter values approximate the actual values in the
population from which the researcher will draw the sample. In many other
circumstances the opposite course will be decided after examining the
output of the program, e.g., in situations like the science achievement
example presented earlier.








CONCLUSION

In this chapter we have described the development of a FORTRAN
program which is based upon an algorithm that expresses both the
true-score and observed-score regression parameters as functions of the
variance structures of the true and error components. Input to the
program consists of information about the covariances among the true
predictors, the reliabilities of the observed predictors, and the
true-score regression coefficients. The program outputs values of the
true-score regression parameters and those of the corresponding
observed-score regression parameters. Comparison of the two sets of
parameter values allows one to assess the degree of bias likely to occur
in observed-score regression coefficients as estimators of their
true-score counterparts. Preliminary evaluation of potential errors of
inference due to measurement error allows the investigator to redesign
the research plan or select new measures. Use of the algorithm and
program is strongly recommended, since it can improve the quality of
research on the determinants of change and prevent erroneous inferences.











Table 1

DEMONSTRATION OF THE RELATIONSHIP BETWEEN PARAMETERS OF TRUE-SCORE AND
OBSERVED-SCORE DISTRIBUTIONS FOR SELECTED VALUES OF RT12, R11, R22,
TB1 AND TB2

RT12  R11    TB1    OB1   STB1   SOB1   T-TB1   T-OB1  R22     TB2     OB2   STB2   SOB2    T-TB2   T-OB2  R2T  R2O  VET  VEO    N

-0.4  0.6  0.100  0.096  0.099  0.078   1.010   1.223  0.7  -0.400  -0.281  0.099  0.081   -4.042  -3.454  0.2  0.1  0.8  0.9  100
-0.4  0.6  0.100  0.062  0.110  0.085   0.907   0.735  0.7   0.000  -0.011  0.110  0.088    0.000  -0.120  0.0  0.0  1.0  1.0  100
-0.4  0.6  0.100  0.029  0.103  0.081   0.972   0.353  0.7   0.400   0.260  0.103  0.084    3.889   3.093  0.1  0.1  0.9  0.9  100
-0.4  0.6  0.400  0.283  0.082  0.069   4.860   4.073  0.7  -0.400  -0.313  0.082  0.072   -4.860  -4.344  0.4  0.3  0.6  0.7  100
-0.4  0.6  0.400  0.249  0.102  0.080   3.940   3.098  0.7   0.000  -0.042  0.102  0.083    0.000  -0.507  0.2  0.1  0.9  0.9  100
-0.4  0.6  0.400  0.215  0.100  0.080   4.017   2.680  0.7   0.400   0.228  0.100  0.083    4.017   2.738  0.2  0.1  0.8  0.9  100
-0.4  0.6  0.700  0.469  0.039  0.051  17.801   9.177  0.7  -0.400  -0.345  0.039  0.053  -10.172  -6.491  0.9  0.6  0.1  0.4  100
-0.4  0.6  0.700  0.436  0.079  0.070   8.848   6.242  0.7   0.000  -0.074  0.079  0.072    0.000  -1.021  0.5  0.3  0.5  0.7  100
-0.4  0.6  0.700  0.402  0.084  0.074   8.340   5.428  0.7   0.400   0.197  0.084  0.077    4.766   2.557  0.4  0.2  0.6  0.8  100
-0.4  0.6  0.100  0.073  0.099  0.078   1.010   0.937  0.9  -0.400  -0.370  0.099  0.092   -4.042  -4.040  0.2  0.2  0.8  0.8  100
-0.4  0.6  0.100  0.061  0.110  0.086   0.907   0.716  0.9   0.000  -0.014  0.110  0.101    0.000  -0.138  0.0  0.0  1.0  1.0  100
-0.4  0.6  0.100  0.050  0.103  0.081   0.972   0.619  0.9   0.400   0.342  0.103  0.095    3.889   3.603  0.1  0.1  0.9  0.9  100
-0.4  0.6  0.400  0.257  0.082  0.068   4.860   3.779  0.9  -0.400  -0.411  0.082  0.080   -4.860  -5.142  0.4  0.4  0.6  0.6  100
-0.4  0.6  0.400  0.246  0.102  0.081   3.940   3.022  0.9   0.000  -0.056  0.102  0.096    0.000  -0.582  0.2  0.1  0.9  0.9  100
-0.4  0.6  0.400  0.234  0.100  0.080   4.017   2.916  0.9   0.400   0.300  0.100  0.094    4.017   3.179  0.2  0.1  0.8  0.9  100
-0.4  0.6  0.700  0.441  0.039  0.048  17.801   9.180  0.9  -0.400  -0.453  0.039  0.057  -10.172  -8.013  0.9  0.7  0.1  0.3  100
-0.4  0.6  0.700  0.430  0.079  0.070   8.848   6.097  0.9   0.000  -0.097  0.079  0.083    0.000  -1.173  0.5  0.3  0.5  0.7  100
-0.4  0.6  0.700  0.418  0.084  0.074   8.340   5.642  0.9   0.400   0.259  0.084  0.087    4.766   2.965  0.4  0.3  0.6  0.8  100
-0.4  0.9  0.100  0.129  0.099  0.091   1.010   1.419  0.7  -0.400  -0.272  0.099  0.082   -4.042  -3.310  0.2  0.2  0.8  0.9  100
-0.4  0.9  0.100  0.083  0.110  0.098   0.907   0.851  0.7   0.000  -0.005  0.110  0.089    0.000  -0.052  0.0  0.0  1.0  1.0  100
-0.4  0.9  0.100  0.038  0.103  0.094   0.972   0.409  0.7   0.400   0.263  0.103  0.085    3.889   3.088  0.1  0.1  0.9  0.9  100
-0.4  0.9  0.400  0.379  0.082  0.078   4.860   4.858  0.7  -0.400  -0.286  0.082  0.071   -4.860  -4.041  0.4  0.4  0.6  0.6  100
-0.4  0.9  0.400  0.334  0.102  0.091   3.940   3.649  0.7   0.000  -0.019  0.102  0.083    0.000  -0.224  0.2  0.1  0.9  0.9  100
-0.4  0.9  0.400  0.289  0.100  0.092   4.017   3.142  0.7   0.400   0.249  0.100  0.083    4.017   2.985  0.2  0.1  0.8  0.9  100
-0.4  0.9  0.700  0.629  0.039  0.050  17.801  12.655  0.7  -0.400  -0.300  0.039  0.045  -10.172  -6.648  0.9  0.7  0.1  0.3  100
-0.4  0.9  0.700  0.584  0.079  0.075   8.848   7.777  0.7   0.000  -0.032  0.079  0.068    0.000  -0.477  0.5  0.4  0.5  0.6  100
-0.4  0.9  0.700  0.539  0.084  0.081   8.340   6.636  0.7   0.400   0.235  0.084  0.074    4.766   3.187  0.4  0.3  0.6  0.7  100
-0.4  0.9  0.100  0.098  0.099  0.090   1.010   1.090  0.9  -0.400  -0.361  0.099  0.093   -4.042  -3.883  0.2  0.2  0.8  0.8  100
-0.4  0.9  0.100  0.083  0.110  0.099   0.907   0.833  0.9   0.000  -0.006  0.110  0.102    0.000  -0.060  0.0  0.0  1.0  1.0  100
-0.4  0.9  0.100  0.067  0.103  0.094   0.972   0.720  0.9   0.400   0.348  0.103  0.096    3.889   3.613  0.1  0.1  0.9  0.9  100
-0.4  0.9  0.400  0.347  0.082  0.077   4.860   4.510  0.9  -0.400  -0.379  0.082  0.079   -4.860  -4.786  0.4  0.4  0.6  0.6  100
-0.4  0.9  0.400  0.332  0.102  0.093   3.940   3.572  0.9   0.000  -0.025  0.102  0.096    0.000  -0.258  0.2  0.1  0.9  0.9  100
-0.4  0.9  0.400  0.316  0.100  0.092   4.017   3.442  0.9   0.400   0.330  0.100  0.095    4.017   3.489  0.2  0.2  0.8  0.9  100
-0.4  0.9  0.700  0.596  0.039  0.047  17.801  12.793  0.9  -0.400  -0.397  0.039  0.048  -10.172  -8.294  0.9  0.8  0.1  0.2  100
-0.4  0.9  0.700  0.580  0.079  0.076   8.848   7.615  0.9   0.000  -0.043  0.079  0.078    0.000  -0.549  0.5  0.4  0.5  0.6  100
-0.4  0.9  0.700  0.565  0.084  0.081   8.340   6.971  0.9   0.400   0.311  0.084  0.083    4.766   3.734  0.4  0.3  0.6  0.7  100
 0.4  0.6  0.100  0.029  0.103  0.081   0.972   0.353  0.7  -0.400  -0.260  0.103  0.084   -3.889  -3.093  0.1  0.1  0.9  0.9  100
 0.4  0.6  0.100  0.062  0.110  0.085   0.907   0.735  0.7   0.000   0.011  0.110  0.088    0.000   0.120  0.0  0.0  1.0  1.0  100
 0.4  0.6  0.100  0.096  0.099  0.078   1.010   1.223  0.7   0.400   0.281  0.099  0.081    4.042   3.454  0.2  0.1  0.8  0.9  100
 0.4  0.6  0.400  0.215  0.100  0.080   4.017   2.680  0.7  -0.400  -0.228  0.100  0.083   -4.017  -2.738  0.2  0.1  0.8  0.9  100
 0.4  0.6  0.400  0.249  0.102  0.080   3.940   3.098  0.7   0.000   0.042  0.102  0.083    0.000   0.507  0.2  0.1  0.9  0.9  100
 0.4  0.6  0.400  0.283  0.082  0.069   4.860   4.073  0.7   0.400   0.313  0.082  0.072    4.860   4.344  0.4  0.3  0.6  0.7  100
 0.4  0.6  0.700  0.402  0.084  0.074   8.340   5.428  0.7  -0.400  -0.197  0.084  0.077   -4.766  -2.557  0.4  0.2  0.6  0.8  100
 0.4  0.6  0.700  0.436  0.079  0.070   8.848   6.242  0.7   0.000   0.074  0.079  0.072    0.000   1.021  0.5  0.3  0.5  0.7  100
 0.4  0.6  0.700  0.469  0.039  0.051  17.801   9.177  0.7   0.400   0.345  0.039  0.053   10.172   6.491  0.9  0.6  0.1  0.4  100
 0.4  0.6  0.100  0.050  0.103  0.081   0.972   0.619  0.9  -0.400  -0.342  0.103  0.095   -3.889  -3.603  0.1  0.1  0.9  0.9  100
 0.4  0.6  0.100  0.061  0.110  0.086   0.907   0.716  0.9   0.000   0.014  0.110  0.101    0.000   0.138  0.0  0.0  1.0  1.0  100
 0.4  0.6  0.100  0.073  0.099  0.078   1.010   0.937  0.9   0.400   0.370  0.099  0.092    4.042   4.040  0.2  0.2  0.8  0.8  100
 0.4  0.6  0.400  0.234  0.100  0.080   4.017   2.916  0.9  -0.400  -0.300  0.100  0.094   -4.017  -3.179  0.2  0.1  0.8  0.9  100
 0.4  0.6  0.400  0.246  0.102  0.081   3.940   3.022  0.9   0.000   0.056  0.102  0.096    0.000   0.582  0.2  0.1  0.9  0.9  100
 0.4  0.6  0.400  0.257  0.082  0.068   4.860   3.779  0.9   0.400   0.411  0.082  0.080    4.860   5.142  0.4  0.4  0.6  0.6  100
 0.4  0.6  0.700  0.418  0.084  0.074   8.340   5.642  0.9  -0.400  -0.259  0.084  0.087   -4.766  -2.965  0.4  0.3  0.6  0.8  100
 0.4  0.6  0.700  0.430  0.079  0.070   8.848   6.097  0.9   0.000   0.097  0.079  0.083    0.000   1.173  0.5  0.3  0.5  0.7  100
 0.4  0.6  0.700  0.441  0.039  0.048  17.801   9.180  0.9   0.400   0.453  0.039  0.057   10.172   8.013  0.9  0.7  0.1  0.3  100
 0.4  0.9  0.100  0.038  0.103  0.094   0.972   0.409  0.7  -0.400  -0.263  0.103  0.085   -3.889  -3.088  0.1  0.1  0.9  0.9  100
 0.4  0.9  0.100  0.083  0.110  0.098   0.907   0.851  0.7   0.000   0.005  0.110  0.089    0.000   0.052  0.0  0.0  1.0  1.0  100
 0.4  0.9  0.100  0.129  0.099  0.091   1.010   1.419  0.7   0.400   0.272  0.099  0.082    4.042   3.310  0.2  0.2  0.8  0.9  100
 0.4  0.9  0.400  0.289  0.100  0.092   4.017   3.142  0.7  -0.400  -0.249  0.100  0.083   -4.017  -2.985  0.2  0.1  0.8  0.9  100
 0.4  0.9  0.400  0.334  0.102  0.091   3.940   3.649  0.7   0.000   0.019  0.102  0.083    0.000   0.224  0.2  0.1  0.9  0.9  100
 0.4  0.9  0.400  0.379  0.082  0.078   4.860   4.858  0.7   0.400   0.286  0.082  0.071    4.860   4.041  0.4  0.4  0.6  0.6  100
 0.4  0.9  0.700  0.539  0.084  0.081   8.340   6.636  0.7  -0.400  -0.235  0.084  0.074   -4.766  -3.187  0.4  0.3  0.6  0.7  100
 0.4  0.9  0.700  0.584  0.079  0.075   8.848   7.777  0.7   0.000   0.032  0.079  0.068    0.000   0.477  0.5  0.4  0.5  0.6  100
 0.4  0.9  0.700  0.629  0.039  0.050  17.801  12.655  0.7   0.400   0.300  0.039  0.045   10.172   6.648  0.9  0.7  0.1  0.3  100
 0.4  0.9  0.100  0.067  0.103  0.094   0.972   0.720  0.9  -0.400  -0.348  0.103  0.096   -3.889  -3.613  0.1  0.1  0.9  0.9  100
 0.4  0.9  0.100  0.083  0.110  0.099   0.907   0.833  0.9   0.000   0.006  0.110  0.102    0.000   0.060  0.0  0.0  1.0  1.0  100
 0.4  0.9  0.100  0.098  0.099  0.090   1.010   1.090  0.9   0.400   0.361  0.099  0.093    4.042   3.883  0.2  0.2  0.8  0.8  100
 0.4  0.9  0.400  0.316  0.100  0.092   4.017   3.442  0.9  -0.400  -0.330  0.100  0.095   -4.017  -3.489  0.2  0.2  0.8  0.9  100
 0.4  0.9  0.400  0.332  0.102  0.093   3.940   3.572  0.9   0.000   0.025  0.102  0.096    0.000   0.258  0.2  0.1  0.9  0.9  100
 0.4  0.9  0.400  0.347  0.082  0.077   4.860   4.510  0.9   0.400   0.379  0.082  0.079    4.860   4.786  0.4  0.4  0.6  0.6  100
 0.4  0.9  0.700  0.565  0.084  0.081   8.340   6.971  0.9  -0.400  -0.311  0.084  0.083   -4.766  -3.734  0.4  0.3  0.6  0.7  100
 0.4  0.9  0.700  0.580  0.079  0.076   8.848   7.615  0.9   0.000   0.043  0.079  0.078    0.000   0.549  0.5  0.4  0.5  0.6  100
 0.4  0.9  0.700  0.596  0.039  0.047  17.801  12.793  0.9   0.400   0.397  0.039  0.048   10.172   8.294  0.9  0.8  0.1  0.2  100

NOTATION KEY:

RT12 - CORRELATION BETWEEN TRUE X1 AND TRUE X2; R11 - RELIABILITY OF OBSERVED X1; TB1 - TRUE REGRESSION OF X1 ON Y;
OB1 - OBSERVED REGRESSION OF X1 ON Y; STB1 - STANDARD ERROR FOR TRUE B1; SOB1 - STANDARD ERROR FOR OBSERVED B1;
T-TB1 - T VALUE FOR TRUE B1; T-OB1 - T VALUE FOR OBSERVED B1; R22 - RELIABILITY OF OBSERVED X2;
TB2 - TRUE REGRESSION OF X2 ON Y; OB2 - OBSERVED REGRESSION OF X2 ON Y; STB2 - STANDARD ERROR FOR TRUE B2;
SOB2 - STANDARD ERROR FOR OBSERVED B2; T-TB2 - T VALUE FOR TRUE B2; T-OB2 - T VALUE FOR OBSERVED B2;
R2T - SQUARED MULTIPLE CORRELATION FOR TRUE SCORES; R2O - SQUARED MULTIPLE CORRELATION FOR OBSERVED SCORES;
VET - MEAN SQUARE ERROR FOR TRUE REGRESSION; VEO - MEAN SQUARE ERROR FOR OBSERVED REGRESSION; N - SAMPLE SIZE

NOTE: FOR ALL PARAMETERS BASED ON REPEATED DRAWINGS OF SAMPLES, I.E., STB1, SOB1, T-TB1, T-OB1, STB2, SOB2, T-TB2, T-OB2,
VET, AND VEO, A CONSTANT N OF 100 WAS ASSUMED. DFE = 97 WAS USED IN ALL CALCULATIONS.

A FORTRAN COMPUTER PROGRAM FOR PERFORMING THESE KINDS OF CALCULATIONS IS AVAILABLE FROM DR. NOEL DUNIVANT, ASSISTANT
PROFESSOR, PSYCHOLOGY DEPARTMENT, NEW YORK UNIVERSITY, 6 WASHINGTON PLACE, 7TH FLOOR, NEW YORK, NEW YORK 10003.






APPENDIX TO CHAPTER VI 






$JOB  G89K1309,LINES=66,NOLIST,NOWARN,NOEXT
      DIMENSION STX(2,2),SUU(2,2),TB(2),SVU(2)
C     INITIALIZE FIXED PARAMETERS
      DO 5 I=1,2
      STX(I,I)=1.0
    5 SVU(I)=0.0
      STY=1.0
      SVV=0.0
      SUU(1,2)=0.0
      SUU(2,1)=0.0
      N=100
C     READ INPUT TRUE PARAMETERS AND OBSERVED RELIABILITIES
      READ(5,101) S12LO,S12HI,S12N,R11LO,R11HI,R11N,R22LO,R22HI
     *,R22N,B1LO,B1HI,B1N,B2LO,B2HI,B2N
  101 FORMAT(15F5.0)
      WRITE(6,102)
  102 FORMAT(1H1)
C     DETERMINE INCREMENTS AND STARTING VALUES (ONE STEP BELOW LOW)
      S12INC=(S12HI-S12LO)/(S12N-1)
      STX(1,2)=S12LO-S12INC
      R11INC=(R11HI-R11LO)/(R11N-1)
      R110=R11LO-R11INC
      R22INC=(R22HI-R22LO)/(R22N-1)
      R220=R22LO-R22INC
      B1INC=(B1HI-B1LO)/(B1N-1)
      TB10=B1LO-B1INC
      B2INC=(B2HI-B2LO)/(B2N-1)
      TB20=B2LO-B2INC
      NS12=S12N
      NR11=R11N
      NR22=R22N
      NB1=B1N
      NB2=B2N
C     LOOP OVER ALL COMBINATIONS OF PARAMETER VALUES
      DO 20 I=1,NS12
      STX(1,2)=STX(1,2)+S12INC
      STX(2,1)=STX(1,2)
      DO 20 J=1,NR11
      IF(J.EQ.1)R11=R110
      R11=R11+R11INC
C     CONVERT RELIABILITY OF X1 INTO AN ERROR VARIANCE
      SUU(1,1)=(STX(1,1)-R11*STX(1,1))/R11
      DO 20 K=1,NR22
      IF(K.EQ.1)R22=R220
      R22=R22+R22INC
C     CONVERT RELIABILITY OF X2 INTO AN ERROR VARIANCE
      SUU(2,2)=(STX(2,2)-R22*STX(2,2))/R22
      DO 20 L=1,NB1
      IF(L.EQ.1)TB(1)=TB10
      TB(1)=TB(1)+B1INC
      DO 20 M=1,NB2
      IF(M.EQ.1)TB(2)=TB20
      TB(2)=TB(2)+B2INC
      CALL REGRES(STX,SUU,TB,STY,SVV,SVU,R11,R22,N)
   20 CONTINUE
      STOP
      END
      SUBROUTINE REGRES(STX,SUU,TB,STY,SVV,SVU,R11,R22,N)
      DIMENSION STX(2,2),SUU(2,2),TB(2),SVU(2),STXINV(2,2),STB(2,2),
     *TT(2),SOX(2,2),STYX(2),SOXINV(2,2),SOYX(2),SOB(2,2),
     *OB(2),OT(2),QSTB(2),QSOB(2)
      ZN=N
C     COMPUTE TRUE-SCORE PARAMETERS
C     COMPUTE R SQUARE TRUE
      CALL QUADF(TB,STX,RSQ)
      R2TY=RSQ/STY
C     COMPUTE MSE TRUE
      DF=ZN-3
      STE=STY*(ZN/DF)*(1-R2TY)
C     COMPUTE SIGMA OF TRUE B VECTOR
      CALL MINV(STX,STXINV)
      DO 5 I=1,2
      DO 5 J=1,2
    5 STB(I,J)=STE*(1/ZN)*STXINV(I,J)
C     COMPUTE TRUE T-TESTS
      DO 10 I=1,2
      QSTB(I)=SQRT(STB(I,I))
   10 TT(I)=TB(I)/QSTB(I)
C     COMPUTE STYX - TRUE YX COVARIANCE VECTOR
      CALL MVMAT(STX,TB,STYX)
C     COMPUTE OBSERVED-SCORE PARAMETERS
C     COMPUTE SOX - SIGMA OF OBSERVED SCORES
      DO 11 I=1,2
      DO 11 J=1,2
   11 SOX(I,J)=STX(I,J)+SUU(I,J)
C     COMPUTE SOXINV
      CALL MINV(SOX,SOXINV)
C     COMPUTE OBSERVED YX COVARIANCE VECTOR
      DO 15 I=1,2
   15 SOYX(I)=STYX(I)+SVU(I)
C     COMPUTE OBSERVED-SCORE REGRESSION VECTOR
      CALL MVMAT(SOXINV,SOYX,OB)
C     COMPUTE OBSERVED Y VARIANCE
      SOY=STY+SVV
C     COMPUTE OBSERVED R SQUARE
      CALL QUADF(OB,SOX,RSQ)
      R2OY=RSQ/SOY
C     COMPUTE SOE - MSE FOR OBSERVED SCORES
      SOE=SOY*(ZN/DF)*(1-R2OY)
C     COMPUTE SIGMA OF OBSERVED B VECTOR
      DO 20 I=1,2
      DO 20 J=1,2
   20 SOB(I,J)=SOE*(1/ZN)*SOXINV(I,J)
C     COMPUTE OBSERVED T-TESTS
      DO 25 I=1,2
      QSOB(I)=SQRT(SOB(I,I))
   25 OT(I)=OB(I)/QSOB(I)
C     PRINT ONE LINE OF TABLE 1. STATEMENT 25 AND THE START OF THE
C     OUTPUT LIST ARE ILLEGIBLE IN THE SOURCE LISTING AND ARE INFERRED
C     FROM FORMAT 100 AND THE COLUMN ORDER OF TABLE 1.
      WRITE(6,100) STX(1,2),R11,TB(1),OB(1),QSTB(1),QSOB(1),TT(1),
     *OT(1),R22,TB(2),OB(2),QSTB(2),QSOB(2),TT(2),OT(2),R2TY,R2OY,
     *STE,SOE,N
  100 FORMAT(1X,1F4.1,1X,1F3.1,1X,4(1F7.3,1X),2(1F8.3,1X),1F3.1,
     *4(1F7.3,1X),2(1F8.3,1X),4(1F3.1,1X),1I3)
      RETURN
      END
      SUBROUTINE MVMAT(X,B,XB)
C     MATRIX-VECTOR PRODUCT
      DIMENSION X(2,2),B(2),XB(2)
      DO 5 I=1,2
      XB(I)=0.0
      DO 5 J=1,2
    5 XB(I)=XB(I)+B(J)*X(J,I)
      RETURN
      END
      SUBROUTINE MINV(X,XINV)
C     INVERT A 2X2 MATRIX
      DIMENSION X(2,2),XINV(2,2)
      D=X(1,1)*X(2,2)-X(1,2)*X(2,1)
      XINV(1,1)=X(2,2)/D
      XINV(2,2)=X(1,1)/D
      XINV(1,2)=-X(1,2)/D
      XINV(2,1)=XINV(1,2)
      RETURN
      END
      SUBROUTINE QUADF(B,S,RSQ)
C     COMPUTE A QUADRATIC FORM: RSQ
      DIMENSION B(2),S(2,2),Y(2)
      DO 5 J=1,2
      Y(J)=0.0
      DO 5 K=1,2
    5 Y(J)=Y(J)+B(K)*S(K,J)
C     RSQ MUST START AT ZERO BEFORE THE PRODUCTS ARE ACCUMULATED
      RSQ=0.0
      DO 6 J=1,2
    6 RSQ=RSQ+Y(J)*B(J)
      RETURN
      END
$ENTRY
  -.4   .4    2  .65  .85    2   .7
$STOP
/*
//






REFERENCES 



Adcock, R. J. A problem in least squares. Analyst, 1878, 5, 53-54.

Aigner, D. J. Regression with a binary independent variable subject to errors
    of observation. Journal of Econometrics, 1973, 1, 49-60.

Aigner, D., & Goldberger, A. S. (Eds.) Latent variables in socio-economic
    models. Amsterdam: North-Holland Publishing Co., 1976.

American Psychological Association. Manual for educational and psychological
    tests and manuals. Washington, D.C.: American Psychological Association,
    1966.

Asher, H. B. Some consequences of measurement error in survey data. American
    Journal of Political Science, 1974, 18, 469-485.

Baltes, P. B., & Nesselroade, J. R. Cultural change and adolescent
    personality: An application of longitudinal sequences. Developmental
    Psychology, 1972, 7, 244-256.

Bartlett, M. S. On the theory of statistical regression. Proceedings of the
    Royal Society of Edinburgh, 1933, 53, 260-283.

Bereiter, C. Some persisting dilemmas in the measurement of change. In C. W.
    Harris (Ed.), Problems in measuring change. Madison: University of
    Wisconsin Press, 1963.

Bergman, L. R. Some univariate models in studying change (Supplement 10).
    Stockholm: Psychological Laboratories, University of Stockholm, 1971.

Blalock, H. M. Path coefficients versus regression coefficients. American
    Journal of Sociology, 1967, 72, 675-676.

Blalock, H. M. Multiple indicators and the causal approach to measurement
    error. American Journal of Sociology, 1969, 75, 264-272.

Block, J. The equivalence of measures and the correction for attenuation.
    Psychological Bulletin, 1963, 60, 152-156.

Bock, R. D., & Petersen, A. C. A multivariate correction for attenuation.
    Biometrika, 1975, 62, 152-156.

Bohrnstedt, G. W. Observations on the measurement of change. In E. F.
    Borgatta & G. W. Bohrnstedt (Eds.), Sociological methodology-1969. San
    Francisco: Jossey-Bass, 1969.

Bohrnstedt, G. W., & Carter, T. M. Robustness in regression analysis. In H. L.
    Costner (Ed.), Sociological methodology-1971. San Francisco:
    Jossey-Bass, 1969.






Box, 6. £• P., & Muller, M. E. A not^e ou the generator! of random normal * 
deviates. Annals of Mathematical Statistics ^ 1958, 29^, 610-611. 

Brewer, M. B. , Campbell, D. T., & Crano, W. D. Testing a single-f actbr model 
as an alternative to the misuse of partial correlations in hypothesis- 
testing research. Sociometry , 1970, 33^, 1-11. 

Browne, W. A comparison of factor analytic techniques. Rsychbmetrika, - 

1968, 33, 267-334. ^ 

Busemeyer, J. R. Importance of measurement theory, error theory, and 
experimental design for testing the significance of interactions. 
Psychological Bulletin , 1980, 8a, 237-244. 

Campbell, B. T. Reforms as experiments. American Psychologist, 1969. 24. 
409t429. ~ ' — ' 

Campbell, B. T., & Boruch, R. F. Making the case for randomized assignment to 
treatme'nts by cons^idering the alternatives: - Six ways in which quasiexperi- 
mental evaluations in compensatory education tend to underestimate effects. 

\ In C. A. Bennett & A. A. Lumsdaine (Eds.), Evaluation and experiment . New 
York: Academic Press, 1975. ' 

Campbell, D. T., & Erlebacher, A. How regression artifacts in quasi-experi- 
mental evaluations can mistakenly make compensatory education look harmful. 
In J. Hellmuch (Ed.), Compensatory education: A national debate. Vol. 
III. The Blsadvan-^iag^d child . New York: Brunner/Mazel, 1970^ 

Campbell, D. T., & Staniey, J. C. Experimental and quasi-experimental desig ns 
for research . Chicago: Rand McNally, xy^b. ' 

Cleary, T. A., & Linn, R. L. Error of measurement and the power of the 

statistical test;. British J ournal of Matliematical and Social Psycholocv 

1969, 22^, 49-55.' ' " 

Cochran, W. G. Errors of measurement in statistics. Technometrics, 1968 10 
637-666. — 

Cochran, W. G. Some effects of errors of measurement on multiple correlation. 
Journal of the American Statistical Association , 1970, 65, 22-34. 

Ccfchran, W. G. Some effects of errors of measurement on linear tegression. 
Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and 
Probability , I, 1971, 527-5^9. ' ; 

Cohen, J. Statistical power analysis for the behavioral sciences (Rev. ed.).
New York: Academic Press, 1977.

Cohen, J., & Cohen, P. Applied multiple regression/correlation analysis for
the behavioral sciences. Hillsdale, N.J.: Lawrence Erlbaum Associates,
1975.






Corder-Bolz, C. R. The evaluation of change: New evidence. Educational and
Psychological Measurement, 1978, 38, 957-976.

Cronbach, L. J. Research on classrooms and schools: Formulation of
questions, design, and analysis (Occasional Paper). Stanford, Cal.:
Stanford University, Stanford Evaluation Consortium, School of Education,
1976.

Cronbach, L. J., & Furby, L. How we should measure "change": Or should we?
Psychological Bulletin, 1970, 74, 68-80. See also Errata, ibid., 1970,
74, 218.

Cronbach, L. J., Gleser, G. C., Nanda, H., & Rajaratnam, N. The dependability
of behavioral measurements: Theory of generalizability for scores and
profiles. New York: Wiley, 1972.

Cronbach, L. J., & Meehl, P. E. Construct validity in psychological tests.
Psychological Bulletin, 1955, 52, 456-473.

Cronbach, L. J., Rogosa, D. R., Floden, R. E., & Price, G. G. Analysis of
covariance in nonrandomized experiments: Parameters affecting bias.
Unpublished manuscript, Stanford Educational Consortium, Stanford
University, 1977.

Cureton, E. E. On certain estimated correlation functions and their standard
errors. Journal of Experimental Education, 1936, 4, 252-264.

Davidson, T. N. Youth in transition. Vol. IV: Evolution of a strategy for
longitudinal analysis of survey panel data. Ann Arbor, Mich.: University
of Michigan, Survey Research Center, 1972.

Dearborn, W. F., Rothney, J. M., & Shuttleworth, F. K. Data on the growth
of public school children. Monographs of the Society for Research in
Child Development, 1938, [volume and number illegible in the source].

DeGracie, J. S. Analysis of covariance when the concomitant variable is
measured with error (Doctoral dissertation, Iowa State University of
Science and Technology, 1968). Dissertation Abstracts, 1969, 29, [page
illegible in the source]. (University Microfilms No. 69-04[remaining
digits illegible])

DeGracie, J. S., & Fuller, W. A. Estimation of the slope and analysis of
covariance when the concomitant variable is measured with error. Journal
of the American Statistical Association, 1972, 67, 930-937.

Draper, N., & Smith, H. Applied regression analysis. New York: Wiley, 1966.

DuBois, P. H. Multivariate correlational analysis. New York: Harper, 1957.

Dunivant, N. Review of least squares procedures for measuring change (Research
Monograph). Austin: University of Texas, Research and Development Center
for Teacher Education, 1975.



Dunivant, N. Statistical problems in the assessment of sex role development.
Paper presented at meetings of the Eastern Psychological Association,
197[final digit and location illegible in the source].

Dunivant, N. Comments on Overall and Woodward's recommendations [words
illegible in the source] nonrandom assignment in the analysis of covariance.
Unpublished manuscript, New York University, 1977. (b)

Dunivant, N. Estimating latent score regression when measurement error
variances are known. Presented at meetings of the Eleventh Annual
[remainder of entry illegible in the source], 1978. (a)

Dunivant, N. A comparison of statistical methods that correct for errors of
measurement. Presented at meetings of the Eastern Educational Research
Association, Williamsburg, Va., March 1978. (b)

Dunivant, N. Problems caused by measurement error in analyzing change.
Presented at meetings of the American Educational Research Association,
San Francisco, April 1979. (a)

Dunivant, N. Procedures for determining the equivalence of measures.
Presented at meetings of the American Educational Research Association,
San Francisco, April 1979. (b)

Dunivant, N. Correcting regression analysis for errors of measurement: A
[remainder of title illegible in the source]. Presented at meetings of the
American Educational Research Association, Boston, April 1980. (a)

Dunivant, N. Bibliography on the effects of measurement error on linear
statistical models. Unpublished manuscript, New York University, 1980. (b)

[Author and beginning of title illegible in the source] in correlation and
regression. Journal of the American Statistical Association, 1945, 40,
493-503.

Elashoff, J. D. Analysis of covariance: A delicate instrument. American
Educational Research Journal, 1969, 6, 383-402.

Evans, S. H., & Anastasio, E. J. Misuse of analysis of covariance when
treatment effect and covariate are confounded. Psychological Bulletin,
1968, 69, 225-234.

Fix, E. Distributions which lead to linear regressions. Proceedings of the
Berkeley Symposium on Mathematical Statistics and Probability. Berkeley:
University of California Press, 1949.

Forsyth, R. A., & Feldt, L. S. An investigation of empirical sampling
distributions of correlation coefficients corrected for attenuation.
Educational and Psychological Measurement, 1969, 29, 61-72.

Forsyth, R. A., & Feldt, L. S. Some theoretical and empirical results
related to McNemar's test that the population correlation coefficient
[remainder of title illegible in the source]. American Educational Research
Journal, 1970, [volume and pages illegible in the source].






[Author illegible in the source]. Developmental methodology: A revised
primer. Minneapolis, Minn.: Burgess, 1976.

Fuller, W. A. Properties of some estimators for the errors-in-variables
model. Paper presented at meetings of the Econometric Society, New
Orleans, December 197[final digit illegible in the source].

Fuller, W. A. Regression analysis for sample survey. Sankhya, 1975, 37C,
[pages illegible in the source].

[One entry illegible in the source.]

Fuller, W. A. Estimation of measurement error [remainder of title illegible
in the source]. [Proceedings of the American] Statistical Association,
Business and Economics Statistics [Section; date and pages illegible].

Fuller, W. A., & Hidiroglou, M. A. Properties of the correction for
attenuation estimator. Proceedings of the American Statistical
Association, Social Statistics Section, Part [remainder illegible in the
source].

Fuller, W. A., & Hidiroglou, M. A. Regression estimation after correcting for
attenuation. Journal of the American Statistical Association, [year,
volume, and pages illegible in the source].

Games, P. A. Confounding problems in multifactor AOV when using several
organismic variables of limited reliability. American Educational Research
Journal, 1975, 12, 225-232.

Goldberger, A. S. Structural equation models: An overview. In A. S.
[editors and book title illegible in the source] (Vol. 1). New [remainder
of entry illegible in the source].

Hamilton, B. L. A Monte Carlo test of the robustness of parametric and
nonparametric analysis of covariance against unequal regression slopes.
Journal of the American Statistical Association, 1976, 71, 864-869.

Hammersley, J. M., & Handscomb, D. C. Monte Carlo methods. Methuen, 1964.

[One entry, apparently dated 1977, illegible in the source.]



[The entries that followed here, running up to the Koopmans entry, are
illegible in the source. Recoverable fragments include entries by Hilton,
Hsu, and Kendall, and one or more Psychometrika entries on the analysis of
covariance structures.]



Koopmans, T. C. Linear regression analysis of economic time series. Haarlem:
De Erven F. Bohn N.V., 1937.

[Two or more entries illegible in the source; one ends "1947, 9, 218-244."]



Lord, F. M. The measurement of growth. [Publication details illegible in the
source.]

Lord, F. M. A significance test for the hypothesis that two variables measure
the same trait except for errors of measurement. Psychometrika, 1957, 22,
207-220.

Lord, F. M. Significance test for a partial correlation corrected for
attenuation. Educational and Psychological Measurement, 1974, 34, 211-220.

Lord, F. M. [Title illegible in the source.] In C. W. Harris (Ed.), Problems
in measuring change. Madison, Wi.: University of Wisconsin Press, 1963.

Lord, F. M., & Novick, M. R. [Remainder of entry illegible in the source.]







[The entries that followed here, running up to the Nicewander & Price entry,
are illegible in the source. Recoverable fragments include an entry
apparently by McNemar.]






Nicewander, A., & Price, J. M. Dependent variable reliability and tests of
significance. Paper presented at the Annual Meeting of the Psychometric
Society, Chapel Hill, N.C., June, 1977.

O'Connor, E. F., Jr. Extending classical test theory to the measurement of
change. Review of Educational Research, 1972, 42, 73-97.

Odell, P. L., & Feiveson, A. H. A numerical procedure to generate a sample
covariance matrix. Journal of the American Statistical Association, 1966,
61, 199-203.

Olejnik, S. F., & Porter, A. C. Bias and mean square errors of estimators as
criteria for evaluating competing analysis strategies in quasi-experiments.
Journal of Educational Statistics, 1981, 6, 33-53.

Overall, J. E., & Klett, C. J. Applied multivariate analysis. New York:
McGraw-Hill, 1972.

Overall, J. E., & Woodward, J. A. Nonrandom assignment and the analysis of
covariance. Psychological Bulletin, 1977, 84, 588-594. (a)

Overall, J. E., & Woodward, J. A. Common misconceptions concerning the
analysis of covariance. Multivariate Behavioral Research, 1977, 12,
171-185. (b)

Porter, A. C. The effects of using fallible variables in the analysis of
covariance (Doctoral dissertation, University of Wisconsin, 1967).
Dissertation Abstracts, 1967, 28, 3517B. (University Microfilms No.
67-12147)

Porter, A. C., & Chibucos, T. R. Selecting analysis strategies. In G. D.
Borich (Ed.), Evaluating educational programs and products. Englewood
Cliffs, N.J.: Educational Technology Publications, 1974.

Pearson, K. On lines and planes of closest fit to systems of points in space. 
Philosophical Magazine, 1901, 2, 559-572.

Peckham, P. D. The robustness of the analysis of covariance to heterogeneous
regression slopes. Paper presented at the Annual Meeting of the American
Educational Research Association, Minneapolis, March, 1970.

Richards, J. M., Jr. A simulation study of the use of change measures to
compare educational programs. American Educational Research Journal, 1975,
12, 299-311.

Rindskopf, D. M. A comparison of various regression-correlation methods for
evaluating nonexperimental research (Doctoral dissertation, Iowa State
University of Science and Technology, 1976). Dissertation Abstracts
International, 1977, 37, 4120B. (University Microfilms No. 77-01,471)

Rogosa, D. Measurement error and the Johnson-Neyman technique. Paper
presented at the Annual Meetings of the American Educational Research
Association, New York, April, 1977. (a)







Rogosa, D. Some results for the Johnson-Neyman technique. Doctoral
dissertation, Stanford University, School of Education, 1977.

Rozeboom, W. W. Dynamic analysis of multivariate process data. Unpublished
manuscript, University of Alberta, Psychology Department, January 1977.

Rubin, D. B. Assignment to treatment group on the basis of a covariate.
Journal of Educational Statistics, 1977, 2, 1-26.

Saunders, D. R. Partial reliability coefficients applied to the
overcorrection problem (RM 51-16). Princeton, N.J.: Educational Testing
Service, 1951.

Schmidt, P. Econometrics. New York: Marcel Dekker, 1976.

Schreider, Yu. A. The Monte Carlo method: The method of statistical trials.
Oxford: Pergamon, 1966. (Translated from the Russian by G. J. Tee.)

Shen, E. The standard error of certain estimated coefficients of correlation.
Journal of Educational Psychology, 1924, 15, 462-465.

Snedecor, G. W., & Cochran, W. G. Statistical methods (6th ed.). Ames, Iowa:
Iowa State University Press, 1967.

Sörbom, D. An alternative to the methodology for analysis of covariance.
Psychometrika, 1978, 43, 381-396.

Spearman, C. The proof and measurement of association between two things. 
American Journal of Psychology, 1904, 15, 72-101. (a)

Stebbins, L. B., Bock, G., & Proper, E. C. Education as experimentation: A
planned variation model (Vol. IV-B). Cambridge, Mass.: Abt Associates,
Inc., 1977.

Stebbins, L. B., St. Pierre, R. G., Proper, E. C., Anderson, R. B., & Cerva,
T. R. Education as experimentation: A planned variation model (Vol.
IV-A). Cambridge, Mass.: Abt Associates, Inc., 1977.

Stocking, M. , & Lord, F. M. AUTEST — Program to perform automated hypothesis 
tests for nonstandard problems (RM-74-25). Princeton: Educational 
Testing Service, 1974. 

Stouffer, S. A. Evaluating the effect of inadequately measured variables in
partial correlation analysis. Journal of the American Statistical
Association, 1936, 31, 348-360. (a)

Stouffer, S. A. Reliability coefficients in a correlation matrix.
Psychometrika, 1936, 1, 17-20. (b)

St. Pierre, R. G., & Ladner, R. Correcting covariates for unreliability:
Does it lead to differences in an evaluator's conclusions? Paper presented
at the meeting of the American Educational Research Association, New York,
April 1977.






Stroud, T. W. F. Comparing conditional distributions under measurement errors
of known variances (Unpublished doctoral dissertation, Stanford
University, 1968). Dissertation Abstracts International, 1969, 29,
2673B. (University Microfilms No. 69-00299)

Stroud, T. W. F. Comparing conditional means and variances in a regression
model with measurement errors of known variances. Journal of the American
Statistical Association, 1972, 67, 407-412.

Stroud, T. W. F. Comparing regressions when measurement error variances are 
known. Psychometrika, 1974, 39, 53-68.

Sutcliffe, J. P. Error of measurement and the sensitivity of a test of
significance. Psychometrika, 1958, 23, 9-17.

Theil, H. Economic forecasts and policy. Amsterdam: North-Holland, 1958.

Theil, H. Principles of econometrics. New York: Wiley, 1971.

Thomson, G. H. A formula to correct for the effect of errors of measurement
on the correlation of initial values with gains. Journal of Experimental
Psychology, 1924, 7, 321-324.

Thorndike, R. L. Regression fallacies in the matched groups experiment.
Psychometrika, 1942, 7, 85-102.

Tucker, L. R., Damarin, F., & Messick, S. A basefree measure of change. 
Psychometrika, 1966, 31, 457-473.

Van de Geer, J. P. Introduction to multivariate analysis for the social
sciences. San Francisco: Freeman, 1971.

Veldman, D. J. Empirical tests of statistical assumptions (RMM-9R). Austin,
Texas: University of Texas, Research and Development Center for Teacher
Education, 197?.

Wald, A. Tests of statistical hypotheses concerning several parameters when
the number of observations is large. Transactions of the American
Mathematical Society, 1943, 54, 426-482.

Walker, H. M., & Lev, J. Statistical inference. New York: Henry Holt & Co.,
1953.

Warren, R. D., White, J. K., & Fuller, W. A. An errors-in-variables analysis
of managerial role performance. Journal of the American Statistical
Association, 1974, 69, 886-893.

Weisberg, H. I. Statistical adjustments and uncontrolled studies.
Psychological Bulletin, 1979, 86, 1149-1164.

Werts, C. E., & Hilton, T. L. Intellectual status and intellectual growth,
again. American Educational Research Journal, 1977, 14, 137-146.






Werts, C. E., & Linn, R. L. A general linear model for studying growth.
Psychological Bulletin, 1970, 73, 17-22.

Werts, C. E., & Linn, R. L. Path analysis: Psychological examples.
Psychological Bulletin, 1970, 74, 193-212.

Werts, C. E., & Linn, R. L. Analyzing school effects: ANCOVA with a fallible
covariate. Educational and Psychological Measurement, 1971, 31, 95-104.

Werts, C. E., Linn, R. L., & Jöreskog, K. G. Another perspective on "Linear
regression, structural relations, and measurement error." Educational and
Psychological Measurement, 1973, 33, 327-332.

Wijsman, R. A. Applications of a certain representation of the Wishart matrix.
Annals of Mathematical Statistics, 1959, 30, 597-601.

Wiley, D. E., & Harnischfeger, A. Post hoc, ergo propter hoc: Problems in
the attribution of change (Report No. 5). Chicago, Ill.: University of
Chicago, Studies of Educative Processes, 1973.

Wiley, D. E., & Hornik, R. Measurement error and the analysis of panel data
(Report No. 7). Chicago, Ill.: University of Chicago, Studies of
Educative Processes, 1973.

Wiley, D. E., Schmidt, W. H., & Bramble, W. J. Studies of a class of
covariance structure models. Journal of the American Statistical
Association, 1973, 68, 317-323.

Witkin, H. A., Goodenough, D. R., & Karp, S. A. Stability of cognitive style
from childhood to young adulthood. Journal of Personality and Social
Psychology, 1967, 7, 291-300.

Wolter, K. M. Estimators for a nonlinear functional relationship.
Unpublished doctoral dissertation, Iowa State University, 1974.

Wolter, K. M., & Corby, C. New software for the linear errors-in-variables
model. Proceedings of the American Statistical Association, 1976.

Wolter, K. M., & Fuller, W. A. Estimation of the quadratic errors-in-variables
model. Unpublished report, U.S. Bureau of the Census, Washington, D.C.,
1977.

Wolter, K. M., & Fuller, W. A. Estimation of the nonlinear errors-in-variables
model. Unpublished report, U.S. Bureau of the Census, Washington, D.C.,
1977.






